PREFACE

Recent advances in statistical method have given the research
biologist a new and valuable weapon to aid him in the accurate
interpretation of his data. Statistics is a branch of applied
mathematics, and a comprehensive understanding of the theory
is possible only to those of a mathematical turn of mind. In
consequence, a full appreciation of the fundamental mathematics
must remain the prerogative of the few. On the other hand,
there is no reason why those who are interested in statistics solely
as an aid to scientific research should forgo the advantages of
applying statistical methods to research data derived from any
standardized experimental design. The interpretation of the
results by means of the appropriate statistical formulas should
then become a purely routine operation. The available literature
is rather technical in character, and several years' experience
with postgraduate students has shown that there is a real demand
for a more elementary exposition of statistical technique. This
book is an attempt to meet this demand and is essentially the
detailed analysis of data from a representative series of experiments
typical of some of the commoner statistical problems
encountered by the average research worker. In certain examples,
the original data have been slightly simplified in order to
make it easy to follow the successive stages in the arithmetical
calculations. It is hoped that these representative examples
may serve as a practical guide to the research worker in the design
and interpretation of his experiments. For the student who has
to study the subject more deeply, they may even help toward an
easier understanding of statistical theory as expounded in the
more technical works.

In compiling this handbook, the writer has made liberal use of
the relevant literature. To the authors of the various theoretical
and practical memoirs consulted, he wishes to express his deep
indebtedness. In particular, grateful acknowledgment is made
to Professor R. A. Fisher and to his publishers, Messrs. Oliver
and Boyd, for permission to reproduce some of the fundamental
tables from “Statistical Methods for Research Workers.”







CONTENTS

Preface

CHAPTER I
General Principles
    Calculation of Mean and Standard Deviation
    Normal Curve of Error
    Standard Error
    Statistical Significance
    Sampling
    Analysis of Small Samples
    Analysis of Correlated Samples: “Student's” Method
    Coefficient of Variation
    Probable Error
    Short Methods of Computation
    Basic Formulas

CHAPTER II
Analysis of Variance
    Analysis of Variance in Its Simplest Form
    The F Test for Comparing Component Variances
    The z Test
    Affinity between the z and t Tests
    Interactions
    Direct Calculation of an Interaction
    Analysis of Data Divided into Subunits
    A Complex Experiment
    Experimental Precision
    Useful Formulas in Analysis of Variance

CHAPTER III
Goodness of Fit and Contingency Tables
    The Chi-squared Test (χ²)
    Binomial Distribution
    Contingency Tables
    Problems in Genetics

CHAPTER IV
Diagrams
    Evaluation of Standard Deviation from a Frequency Table
    Correlation Diagrams

CHAPTER V
Correlation
    Calculation of a Correlation Coefficient
    Significance of a Correlation Coefficient
    Easy Methods of Evaluation
    Statistical Comparison of Correlation Coefficients
    Partial Correlation
    Intraclass Correlation

CHAPTER VI
Regression
    Estimation of Coefficient of Regression
    Significance of Regression Function
    Comparison of Independent Estimates of Coefficient of Regression
    Linear Regression Component of Variation
    Reduction of Error Variance by Means of Regression
    Analysis of Covariance
    Test for Linearity of Regression Line

CHAPTER VII
[title illegible]
    Introduction
    General Principles
    Block Layout
    Latin Square
    Generalisation of Results
    Grouping of Treatment Comparisons
    Records

CHAPTER VIII
[...] and Perennial Crop Experiments
    [...] Experiments
    Experiments with Perennial Crops
    Analysis of Data from Perennial Crops
    [...] of Grouped Data When Different Assumed Means Are [...]

CHAPTER IX
[...] in Field Experimentation
    Linear Regression Component of the Treatment Variance
    Confounding of Treatment Effects
    Subdivision of the Treatment Responses in a 3³ Experiment
    A 3³ Experiment without Replication
    Complex Confounded Designs
    Valedictory Remarks

Selected Bibliography

Appendix: Statistical Tables
    I. Table of [...]
    II. Table of t
    III. 5 Per Cent Points of the Distribution of z
    IV. Table of χ²
    V. Napierian Logarithms
    VI. Table of F
    VII. Table of Number of Replicates Necessary to Give Significant
         Differences






I would go further and insist that all biological
investigation involves a statistical consideration of the

relationship between the variables, it is necessary to tabulate the variate
for A against the corresponding one for B, e.g., brother vs. sister,
and modify the statistical procedure in accordance with the
method evolved by “Student.” As the chicks in Series B
(Example 3) were known to be the offspring of the same parents as
those in Series A in the order shown from 1 to 10, these data may
be used to illustrate “Student's” method of analysis. The first
step is to tabulate, for each pair of chicks of the same family, the
difference in weight due to the alternative methods of rearing the
birds and then from these individual differences to evaluate
the standard error of the mean difference directly.

Example 4. Determination of Significance of Mean Difference
by “Student's” Method.

Table 4
Columns: Series (parentage); Weight of chicks; Difference in weight
between series; Deviation from mean difference; Square of deviation.
Summary lines: Mean of differences; Standard error of mean difference.

t = 2.857
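The paired procedure described above can be sketched in a few lines of Python. This is an illustration only: the weights are those listed for the two series in Table 10, and treating the values as paired by position (family 1 to 10) is an assumption based on the statement that the series are given in family order. The function name `paired_t` is hypothetical.

```python
import math

# Chick weights (oz) as listed in Table 10; pairing by position is an assumption.
series_a = [9, 17, 14, 13, 15, 10, 11, 13, 13, 15]
series_b = [8, 15, 11, 11, 9, 12, 11, 10, 9, 14]


def paired_t(a, b):
    """Return (mean difference, standard error of mean difference, t)."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    # Sum of squares of the deviations of each difference from the mean difference.
    ss = sum((d - mean_diff) ** 2 for d in diffs)
    # Standard error of the mean difference, on n - 1 degrees of freedom.
    se = math.sqrt(ss / (n * (n - 1)))
    return mean_diff, se, mean_diff / se


mean_diff, se, t = paired_t(series_a, series_b)
print(f"mean difference = {mean_diff:.2f}, standard error = {se:.3f}, t = {t:.3f}")
```

With these figures the mean difference is 2 ounces and the standard error is close to 0.70. Dividing 2 by the rounded value 0.70 gives 2.857, whereas the unrounded arithmetic gives a t of about 2.86.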
a correct idea of the relative dispersion of the different variables
directly from the calculated values of the standard deviations.
The units of measurement may be entirely different in the various
experiments, and the figure for the standard deviation must be
considered in relation to the size of the mean from which it has
been determined. For example, a deviation of 2 from a mean of
10 is exactly equivalent, as regards variation, to one of 8 from a
mean of 40. For comparative purposes it is customary to express
the standard deviation as a percentage of the mean from which
it has been calculated. In this form, it is termed the coefficient
of variation.

The data from Example 2 have been used to calculate the
coefficient of variation in the two series A and B.

Series   Mean height, in.   Standard deviation   Coefficient of variation, per cent
A        68.0               3.55                 (3.55/68.0) × 100 = 5.22
B        66.5               2.34                 (2.34/66.5) × 100 = 3.52

The dispersion of the variates round the mean is therefore
distinctly greater in Series A than in Series B.
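As a small illustration of the calculation just described, the coefficient of variation can be written as a one-line function. The function name is hypothetical; the figures are the ones quoted in the text (a deviation of 2 on a mean of 10, one of 8 on a mean of 40, and Series B).

```python
def coefficient_of_variation(sd, mean):
    """Standard deviation expressed as a percentage of the mean."""
    return 100.0 * sd / mean


# The two cases the text calls exactly equivalent as regards variation.
cv_small = coefficient_of_variation(2, 10)
cv_large = coefficient_of_variation(8, 40)

# Series B of Example 2: mean height 66.5 in., standard deviation 2.34.
cv_b = coefficient_of_variation(2.34, 66.5)

print(cv_small, cv_large, round(cv_b, 2))
```

Because the result is a pure percentage, series measured in different units can be compared directly.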

PROBABLE ERROR

This is a statistic that was formerly used as the measure of
the dispersion of the variates round the mean. Its value is
0.67449 × standard deviation, calculated in the ordinary way.
The probable error is such that, in the normal curve, ordinates
raised at deviations from the mean equivalent to plus and minus
the probable error divide the curve, along with the mean ordinate,
into four equal sections or quarters; these ordinates are therefore
termed quartiles. In determining the significance of a difference
between mean values, the normal criterion is twice the standard
error; this is roughly equivalent to three times the probable error
of the mean. The term, probable error, is rather misleading
as the quantity does not represent the most probable mistake
likely to occur in any series of observations. Fisher states that
its only recommendation is its frequent use. Mention of
the probable error has been made here only because it occurs
frequently in many of the older books on statistics and may
consequently create some confusion among students not familiar
with the term.


SHORT METHODS OF COMPUTATION

When the number of variates is limited and the observations are
recorded as integers containing only one or two digits, and the
mean is a whole number, the direct method of calculating the
standard deviation by squaring and summing the individual
deviations does not entail an excessive amount of arithmetic.
In most statistical problems, however, the number of readings is
large, the mean is rarely an integer, and the variates often include
three or more digits. Under these circumstances, the routine
arithmetic can be greatly reduced by modifying the arithmetical
technique. It is proposed to describe some of these alternative
methods of computation and to indicate the type of data to which
each one is particularly appropriate. It should be clearly understood
that it is only the arithmetical procedure that is changed
and that the final estimate of any statistic will not be altered.
These alternative methods must not be regarded as providing
mere approximations to the desired values. In fact, when
the mean is not a whole number, they may even eliminate the
fractional errors that would otherwise be unavoidable in tabulating
the deviations, and tend therefore to be more rather than
less accurate than the direct method. It is proposed to use the
relatively simple data from Example 3 to exemplify some of the
short methods commonly adopted in statistical computation.

Example 5. Assumed-mean Method of Calculating Standard
Deviation.

Procedure. Instead of calculating the true mean of the
variates, an approximate or assumed mean is selected arbitrarily
from a rapid survey of the data. From this assumed mean, the
deviations and deviations squared are evaluated and summed.
It facilitates the rapid and accurate estimation of the deviations
if a whole number, preferably a multiple of 10, is chosen as
the assumed mean. The assumed mean need not approximate
closely to the true mean, but on the other hand, the closer
it is to the true mean, the smaller in the aggregate will be the
deviations and the squares of these deviations.

Table 6. Assumed Mean Method of Calculation
Columns: Deviations from assumed mean; Square of deviations.

$\text{Mean} = M_a + \dfrac{\Sigma (y - M_a)}{n}$

$\sigma = \sqrt{\dfrac{\text{S.S.}}{n - 1}} = 2.45$ (as originally calculated)

Unless the assumed mean happens to coincide with the value of the true
mean, the sum of the deviations as totaled in column III will
not be zero but may take any value above or below zero. The
true mean is equal to the assumed mean + the algebraic sum of
the deviations in column III divided by the total number of
variates. A useful check on the arithmetic can be obtained by
calculating the true mean directly from the original data. The
sum of the squares of the deviations from the assumed mean
(column IV) also requires correction before the true standard
deviation can be evaluated. The correction consists in subtracting
a quantity equivalent to the square of the sum of the deviations
divided by the number of variates.



$\sigma = \sqrt{\dfrac{\text{S.S.}}{n - 1}} = 2.21$ (as originally calculated)

Procedure. The mean is calculated by the ordinary method,
each variate is then squared and the sum of these squares entered.
If the assumed mean is zero, the variates represent the deviations

This variable-squared method is of benefit when the mean is
a decimal and when the variates do not include more
than three digits. When the variates contain more digits than
this, their squares run to over six figures and are cumbersome
to work with; and then the assumed-mean method becomes
preferable.
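The claim that the short methods change only the arithmetic, not the result, can be checked with a brief sketch. The data are the ten Series A chick weights as listed in Table 10; the assumed mean of 10 is chosen here purely for illustration and is not taken from the text, and the function names are hypothetical.

```python
import math

series_a = [9, 17, 14, 13, 15, 10, 11, 13, 13, 15]  # Series A weights, oz


def ss_direct(y):
    """Sum of squares of deviations from the true mean."""
    m = sum(y) / len(y)
    return sum((v - m) ** 2 for v in y)


def ss_assumed_mean(y, m_a):
    """Sum of squares about an assumed mean, less the correction factor."""
    devs = [v - m_a for v in y]
    correction = sum(devs) ** 2 / len(y)  # (sum of deviations)^2 / n
    return sum(d * d for d in devs) - correction


def ss_variable_squared(y):
    """Sum of the squared variates, less (sum of variates)^2 / n."""
    return sum(v * v for v in y) - sum(y) ** 2 / len(y)


n = len(series_a)
results = (
    ss_direct(series_a),
    ss_assumed_mean(series_a, 10),  # assumed mean of 10 is illustrative only
    ss_variable_squared(series_a),
)
sd = math.sqrt(results[0] / (n - 1))
print(results, round(sd, 2))
```

All three routes give a sum of squares of 54 and hence a standard deviation of about 2.45, the value quoted as originally calculated.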

Decimal Fractions. When the unit of measurement necessitates
the inclusion of decimal fractions in tabulating the data,
undesirable inaccuracies in the routine arithmetic may be
avoided if the variates are multiplied by the lowest power of 10,
say $10^x$, that will eliminate the decimal. When the statistical
analysis has been completed, it is easy to revert to the original
units by dividing the calculated statistics by the same power
of 10. It is possibly advisable to add the rider that for statistics
representing squared values, e.g., the variance, the correct divisor
will be $10^{2x}$.
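A minimal sketch of the scaling rule follows. The four readings are hypothetical values invented for the illustration, not data from the text; the point is only that a variance computed on variates multiplied by 10 to the power x must be divided by 10 to the power 2x.

```python
from statistics import variance

values = [1.45, 1.28, 1.20, 1.12]  # hypothetical readings with two decimals
power = 2                          # lowest power of 10 that removes the decimals

# Work in whole numbers: 145, 128, 120, 112.
scaled = [round(v * 10 ** power) for v in values]
var_scaled = variance(scaled)

# A squared statistic reverts to the original units on division by 10^(2x).
var_original = var_scaled / 10 ** (2 * power)

print(scaled, var_scaled, var_original)
```

The standard deviation, not being a squared quantity, would instead be divided by 10 to the power x.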

Short Methods of Computing Standard Error. The key to
most tests of significance is the standard error or the standard
error of the mean difference. In evaluating either of these
statistics, it is generally advisable to leave the simplification of
the preliminary statistical expressions to the end. Thus in
Example 3, there is no need to work out the individual values of
σ for the two series; the standard error of the difference between
means is most easily calculated from the respective variances.

$E_d = \sqrt{\dfrac{54 + 44}{9 \times 10}} = 1.04$ (as originally calculated)

In certain types of statistical analysis, the standard errors of
the two means are identical. Under those conditions,

$E_d = \sqrt{2} \times$ standard error (of either mean)

or, assuming that the two samples have the same number of
variates, n,

$E_d = \sqrt{\dfrac{2 \times \text{variance of either variable}}{n}}$


In many problems, the total of the variates forms just as good
a measure of type as the mean. It is often simpler to test the
significance of a difference between totals instead of between
means. Using the formula for the standard error of a total, the
standard errors of the totals of Series A and Series B are

$\sqrt{\dfrac{54}{9} \times 10}$ and $\sqrt{\dfrac{44}{9} \times 10}$

and the standard error of the difference between these two totals is

$\sqrt{\dfrac{54 + 44}{9} \times 10} = 10.4$

It follows that the value of t calculated from the totals of each
series is exactly the same as from the means, and the t test
applied to the totals will therefore lead to the same conclusions.
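The equivalence of the two tests can be verified numerically. The sketch below uses the sums of squares 54 and 44 quoted for the two series of Example 3, together with series totals of 130 and 110 taken from Table 10; the variable names are hypothetical.

```python
import math

n = 10                    # variates per series
ss_a, ss_b = 54, 44       # sums of squares of Series A and B
total_a, total_b = 130, 110

# Standard error of the difference between the two means.
se_mean_diff = math.sqrt((ss_a + ss_b) / ((n - 1) * n))
# Standard error of the difference between the two totals.
se_total_diff = math.sqrt((ss_a + ss_b) / (n - 1) * n)

t_from_means = (total_a / n - total_b / n) / se_mean_diff
t_from_totals = (total_a - total_b) / se_total_diff

print(round(se_mean_diff, 2), round(se_total_diff, 1))
print(round(t_from_means, 3), round(t_from_totals, 3))
```

Both the difference and its standard error are multiplied by n when totals replace means, so the ratio t is unchanged.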

BASIC FORMULAS*

* The notation is as in Example 1.

Standard Deviation.

a. By Direct Method of Calculation.

$\text{S.S.} = \Sigma (y - M)^2$

$\text{Variance} = \dfrac{\text{S.S.}}{n - 1}$

$\sigma = \sqrt{\dfrac{\text{S.S.}}{n - 1}}$

where S.S. = sum of squares.
σ = standard deviation.
n − 1 = number of degrees of freedom.

b. By Assumed-mean Method of Calculation.

Let assumed mean = $M_a$.

$M = M_a + \dfrac{\Sigma (y - M_a)}{n}$

$\text{C.F.} = \dfrac{[\Sigma (y - M_a)]^2}{n}$

$\text{S.S.} = \Sigma (y - M_a)^2 - \text{C.F.}$

taking into account the sign + or − of $\Sigma (y - M_a)$,

where M = true mean.
$M_a$ = assumed mean.
C.F. = correction factor.

c. By Variable-squared Method.

$\text{S.S.} = \Sigma y^2 - \dfrac{(\Sigma y)^2}{n}$

Standard Error.

Standard error of a mean of n observations,

$E = \dfrac{\sigma}{\sqrt{n}} = \sqrt{\dfrac{\text{variance}}{n}} = \sqrt{\dfrac{\text{S.S.}}{n(n - 1)}}$

Standard error of a total of n observations,

$E_t = \sigma \times \sqrt{n} = \sqrt{\text{variance} \times n}$

Standard Error of a Difference.

a. General.

Standard error of the difference between the means of
samples A and B containing $n_A$ and $n_B$ observations, respectively,

$E_d = \sqrt{E_A^2 + E_B^2} = \sqrt{\dfrac{\text{variance of } A}{n_A} + \dfrac{\text{variance of } B}{n_B}}$

When the standard error of each sample is the same, i.e.,
$E_A = E_B = E$, then

$E_d = \sqrt{2E^2} = \sqrt{\dfrac{2 \times \text{variance}}{n}}$

Standard error of the difference between the totals of the
samples A and B,

$\sqrt{E_{tA}^2 + E_{tB}^2}$
b. Correlated Series. If $y_1$ and $y_2$ represent corresponding
variates in two correlated series, each containing n observations,
and $d = y_1 - y_2$,

Mean difference, $D = \dfrac{\Sigma (y_1 - y_2)}{n}$

Standard error of the mean difference, $E_D = \sqrt{\dfrac{\Sigma (d - D)^2}{n(n - 1)}}$

By variable-squared method of calculation,

$E_D = \sqrt{\dfrac{\Sigma d^2 - (\Sigma d)^2 / n}{n(n - 1)}}$

The statistical analysis of a very wide range of experimental
data depends on the correct application of these fundamental
formulas. The elementary student of statistics should make
himself thoroughly familiar with their application to the simple
examples cited before proceeding to the more advanced sections
of the book. Without this preliminary grasp of the general
ANALYSIS OF VARIANCE

Hitherto the calculations have been limited to the
comparison of statistics from not more than two
series. It is seldom that ... many distinct series or
... combined readings from all ...
The deviations in the latter calculation are not deviations of
single variates but of the means of 10 variates. To obtain a
value representing the aggregate dispersion of the 10 variates
in each sample, it is necessary to multiply by 10 the square of
the deviations as calculated from the means. The sum of squares
between series is therefore 2 × 10 = 20. As only two deviations
were used in its determination, this sum of squares has only
1 degree of freedom. The complete analysis of variance can now
The within-series variance measures the unavoidable variation
between similar units in the material or population from which
the data have been collected, and the square root of this value
gives the standard deviation. As it is
from the standard deviation that the standard errors of the means
of series are calculated, the within-series variance is generally
termed the error variance. Thus

Standard error of the mean of Series A or B = $\sqrt{\dfrac{5.44}{10}} = 0.74$

Standard error of the difference between the means of A and B = $\sqrt{\dfrac{5.44 \times 2}{10}} = 1.04$

To be significant, the difference between the means must be
greater than t × 1.04 = 2.101 × 1.04 = 2.185, where t is the
reading from the Table of t (Table II in Appendix) for P = 0.05
and n = 18, i.e., for the number of degrees of freedom of the error
variance. As the difference between the means of A and B is
only 13 − 11 or 2 ounces, it is not significant.


Factor                        S.S.   Degrees of   Variance, i.e.,
                                     freedom      S.S./degrees of freedom
Total                         118    19
Between series                 20     1           20.0
Within series, i.e., error     98    18            5.44
The following points should be noted. The whole of the
available data have been used to provide an estimate of the
standard deviation, which can be validly applied to determine
the standard errors of the means of any of the component series.
As these series contain the same number of variates, their
standard errors are identical, and the standard error of the difference
between the means is therefore $\sqrt{2}$ × the standard error
of any one. Furthermore, the aggregate sum of squares and the
aggregate degrees of freedom must be exactly equal to the total
sum of squares and degrees of freedom, as independently evaluated.
This forms a useful method of checking the arithmetic.
The between-series variance is often termed the treatment variance,
as it is the result of differences in treatment, either natural or
artificial, to which the groups of variates have been subjected.
When the treatments are complex, the treatment variance may
in turn have to be split up into so many component variances in
order to complete the analysis of the data.

The short methods of computation can be used with advantage
in the calculation of the various factors in the analysis of variance.
Using the same data, the evaluation of the total and the treatment
sums of squares by the variable-squared method is
given in Table 10, and the assumed-mean method has been
adopted in the next example.

The only point in the calculation of Table 10 that might require
further elucidation is the division of the sum of the squares of the
treatment totals by 10. In Series A, there are 10 variates belonging
to a group having an average weight of 13 ounces. For the
series as a whole, irrespective of the dispersion shown by the
individual readings within the group, the sum total of the
squares of the variates is 10 × 13². This is equivalent to

$10 \times \left( \dfrac{T_A}{10} \right)^2 = \dfrac{T_A^2}{10} = \dfrac{130^2}{10}$, as calculated. Similarly for Series B,

the required sum of the squares of the variates for the series as a
whole is

$\dfrac{T_B^2}{10} = \dfrac{110^2}{10}$

The division by 10 is therefore merely a correction for the fact
that the aggregate, and not the individual values, of 10 variates
has been used in calculating the squares. The process is a
parallel one to the multiplication by 10 when the means of the
series are used in determining the deviations, as in the original
calculations.

Table 10. Analysis of Variance by the Variable-squared Method

Series or    Wt. of chicks, oz.   Square of variates   Treatment total   Square of treatment total
treatment    (y)                  (y²)                 (T)               (T²)

A             9                    81
             17                   289
             14                   196
             13                   169
             15                   225
             10                   100
             11                   121
             13                   169
             13                   169
             15                   225                  130               16,900

B             8                    64
             15                   225
             11                   121
             11                   121
              9                    81
             12                   144
             11                   121
             10                   100
              9                    81
             14                   196                  110               12,100

Total       240                 2,998                  240               29,000

$\text{C.F.} = \dfrac{(\Sigma y)^2}{n} = \dfrac{240^2}{20} = 2,880$

Total S.S. = $\Sigma y^2 - \text{C.F.}$ = 2,998 − 2,880 = 118

Treatment S.S. = $\dfrac{\Sigma T^2}{10} - \text{C.F.} = \dfrac{29,000}{10} - 2,880 = 20$


The division of the total variance into its two components,
the within- and the between-series variances, exemplifies a
simple but very common form of the analysis of variance and one
that can be applied to a wide range of experimental observations.
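The whole analysis of Table 10 can be reproduced with a short function. This is a sketch of the variable-squared arithmetic described above, not a general-purpose routine; the function and key names are hypothetical, and the value t = 2.101 is the tabulated reading for 18 degrees of freedom quoted in the text.

```python
import math

groups = {
    "A": [9, 17, 14, 13, 15, 10, 11, 13, 13, 15],
    "B": [8, 15, 11, 11, 9, 12, 11, 10, 9, 14],
}


def one_way_anova(groups):
    """Split the total sum of squares into treatment and error components."""
    values = [y for ys in groups.values() for y in ys]
    n_total = len(values)
    correction = sum(values) ** 2 / n_total                 # C.F.
    total_ss = sum(y * y for y in values) - correction
    # Each squared treatment total is divided by the number of variates in it.
    treatment_ss = sum(sum(ys) ** 2 / len(ys) for ys in groups.values()) - correction
    error_ss = total_ss - treatment_ss
    treatment_df = len(groups) - 1
    error_df = n_total - len(groups)
    return {
        "total_ss": total_ss,
        "treatment_ss": treatment_ss,
        "error_ss": error_ss,
        "treatment_df": treatment_df,
        "error_df": error_df,
        "error_variance": error_ss / error_df,
    }


table = one_way_anova(groups)

# Standard error of the difference between two means of 10 variates each.
se_diff = math.sqrt(table["error_variance"] * 2 / 10)
critical_difference = 2.101 * se_diff  # t for P = 0.05 and 18 degrees of freedom

print(table)
print(round(se_diff, 2), round(critical_difference, 2))
```

The unrounded critical difference is about 2.19; the text's 2.185 arises from first rounding the standard error to 1.04. Either way the observed difference of 2 ounces falls short of it.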


Procedure. To eliminate the decimals, the percentage figures
from the chemical analysis have been treated as if multiplied by
100. Deviations and deviations squared from an assumed mean
of 110 have then been evaluated for each variate and for each
grass.

$\text{Mean} = M_a + \dfrac{\Sigma (y - M_a)}{n} = 110 + \dfrac{94}{12} = 117.8$

$\text{C.F.} = \dfrac{94^2}{12} = 736.3$

Total S.S. = 2,318.0 − 736.3 = 1,581.7

There are 12 readings in all or a total of 11 degrees of freedom.

Between-variety or treatment S.S. = $\dfrac{\Sigma T^2}{2} - 736.3 = 1,466.7$

In calculating this component, the total deviation T of the two
samples of each variety was squared; hence it is necessary,
before subtracting the correction factor, to divide the sum of
these squares by two.

As there are six varieties, there will be 5 degrees of freedom
for the treatments. The within-variety or error sum of squares
is the third and final component and must be equal to the difference
between the total and the treatment sum of squares or
1,581.7 − 1,466.7 = 115.0 with 11 − 5 = 6 degrees of freedom. In the
previous example, this component has been evaluated
independently.

Factor       S.S.      Degrees of freedom   Variance
Total        1,581.7   11
Treatment    1,466.7    5                   293.3
Error          115.0    6                    19.2
THE F TEST FOR COMPARING COMPONENT VARIANCES

In any analysis of variance, if the treatment variance is
approximately the same as or less than the error variance,
there is no evidence
that there is some fundamental difference between the
treatments. When the treatment variance is the larger, its
significance is tested by calculating the ratio of the larger to the
smaller variance, a value generally denoted by the letter F. The
significance of F depends on the numbers of degrees of freedom,
$n_1$ and $n_2$, of the two variances from which
the value of F has been determined. The calculated value
is compared with the reading of F in the Table of F for the appropriate
values of $n_1$ and $n_2$, where $n_1$ represents the number of degrees
of freedom of the larger variance. In the Table of F, values
of $n_1$ are tabulated along the top of the table and values of $n_2$ down
the left-hand side. The reading required is the one in the column
corresponding to the number of degrees of freedom $n_1$ of the larger
variance and on the line corresponding to the number of degrees
of freedom $n_2$ of the smaller variance. A calculated value of F
which exceeds the reading of F for P = 0.05 is significant, but
if it fails to attain this level, apparent differences between
treatments must be regarded as nonsignificant and attributed
to errors of random sampling. Obviously, a calculated value
of F which exceeds the appropriate reading for P = 0.01 is highly
significant. A concrete example should enable the student to
grasp what this technique involves in practice.

For the data of Table 12, the required calculations would be
as follows:

$F = \dfrac{\text{treatment variance}}{\text{error variance}} = \dfrac{1,466.7 / 5}{115.0 / 6} = \dfrac{293.3}{19.2} = 15.3$

The Table of F (Appendix, Table VI) for $n_1$ = 5, $n_2$ = 6,
and P = 0.05 records a reading of F = 4.39 and for P = 0.01 a
reading of 8.746. The calculated value of F exceeds either
of these, so that the difference between the treatments is not
only significant but is also highly significant, as determined on a
probability considerably less than 0.01.
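The F test for this example can be sketched as follows. The sums of squares and the two tabulated readings (4.39 and 8.746) are the figures quoted in the text; the error sum of squares is obtained by difference, and the names used are hypothetical. No statistical library is assumed, so the comparison is made against the quoted table readings rather than a computed probability.

```python
total_ss = 1581.7
treatment_ss = 1466.7
n_readings = 12
n_varieties = 6

error_ss = total_ss - treatment_ss          # third component, by difference
treatment_df = n_varieties - 1              # 5
error_df = (n_readings - 1) - treatment_df  # 11 - 5 = 6

treatment_variance = treatment_ss / treatment_df
error_variance = error_ss / error_df
f_ratio = treatment_variance / error_variance

# Tabulated readings quoted in the text for n1 = 5, n2 = 6.
F_05, F_01 = 4.39, 8.746

if f_ratio > F_01:
    verdict = "highly significant"
elif f_ratio > F_05:
    verdict = "significant"
else:
    verdict = "not significant"

print(round(treatment_variance, 1), round(error_variance, 1), round(f_ratio, 1), verdict)
```

The ratio comes to about 15.3, well beyond the 1 per cent reading.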

As the F test has given a positive result, the error variance
may now be used to compare the mean nitrogen percentages
of the various grasses:

Standard deviation = $\sqrt{19.2}$

Standard error of the mean percentage for each grass = $\sqrt{\dfrac{19.2}{2}}$

Standard error of the difference between two such means = $\sqrt{\dfrac{19.2 \times 2}{2}} = 4.38$

It is advisable at this stage to revert to the true units of
measurement by dividing by the factor originally used to eliminate
the decimals from the statistical calculations. In this
example, the factor was 100 so that the standard error of the
difference between the mean nitrogen percentages of the various
grasses is 0.0438. The reading of t for n = 6 and P = 0.05 is
2.447, and differences between treatment means greater than
2.447 × 0.0438 = 0.107 are therefore significant.
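The critical difference can be recomputed in a few lines. The error variance of 19.2, the scaling factor of 100, the two samples per grass, and t = 2.447 are taken from the text; the three means used in the comparison are read from Table 14, and the helper name is hypothetical.

```python
import math

error_variance = 19.2   # in scaled units (percentages multiplied by 100)
samples_per_grass = 2
scale = 100
t_5pct = 2.447          # tabulated t for 6 degrees of freedom, P = 0.05

# Standard error of the difference between two treatment means, true units.
se_diff = math.sqrt(error_variance * 2 / samples_per_grass) / scale
critical_difference = t_5pct * se_diff


def is_significant(mean_1, mean_2):
    """True when two treatment means differ by more than the critical difference."""
    return abs(mean_1 - mean_2) > critical_difference


para, elephant, guinea = 1.450, 1.285, 1.200  # mean nitrogen, per cent

print(round(se_diff, 4), round(critical_difference, 3))
print(is_significant(para, elephant), is_significant(elephant, guinea))
```

On these figures Para grass differs significantly from elephant grass, while elephant grass and guinea grass do not differ significantly, in agreement with the conclusions drawn in the text.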

This value has now to be used to assess the relative merits
of the six fodder grasses by comparing the varietal means in all
possible combinations, two at a time, in order to determine the
significant differences. When a number of different treatments
are concerned, this is most easily effected by tabulating the
means in ascending order and entering alongside each the amount
of the difference from the previous value, thus:

Fodder grass        Mean nitrogen, %   Difference from previous value
Para grass
Elephant grass
Guinea grass
Uba cane
Guatemala grass
Coimbatore cane

Any difference or cumulative difference greater than 0.107, the
critical difference as already calculated, proves a significant
increase over varieties lower down on the list. On this basis,
Para grass is significantly better than any of the other grasses.
Elephant grass is better than the remaining four except guinea
grass, which in turn is significantly better than the Guatemala
grass and the Coimbatore cane. There is no significant difference
between the last three varieties listed.

Another very effective method of summarizing the results is to
express the treatment means as a percentage of any standard or
control treatment. Using guinea grass here as the control, the
results, again arranged in ascending order, might be expressed as
shown in Table 13.

Table 13. Summary of Results
Columns: Fodder grass; Mean nitrogen, %; Mean, % of control;
Standard error, % of control; Classification.
Rows: Para grass; Elephant grass; Control (guinea grass); Uba cane;
Guatemala grass; Coimbatore cane.
Classifications: Very good; Good to average.

The standard error of each mean, as shown in the penultimate
column, has also been expressed as a percentage of the control.

Its value is $\dfrac{0.0310}{1.200} \times 100 = 2.58$ per cent. A difference between the

percentages greater than $2.58 \times \sqrt{2} \times 2.447 = 8.9$ is significant.
In most examples, especially when the number of
degrees of freedom of the error variance exceeds 10, the critical
difference may be taken as equivalent to three times the
standard error of the treatment mean, since the significant
difference is $t \times \sqrt{2}$ times the standard error and $t \times \sqrt{2}$ is
approximately three. It is now very easy to classify the treatments
into those significantly better, equal to, or worse than the
control, as shown in the table.
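The percentage-of-control figures can be checked in the same way. The sketch assumes a control mean of 1.200 per cent for guinea grass, as read from Table 14, together with the error variance and t value already quoted; the rule of thumb is illustrated with t = 2.101, the reading for 18 degrees of freedom used earlier in the chapter.

```python
import math

error_variance = 19.2   # scaled units (percentages multiplied by 100)
samples_per_grass = 2
scale = 100
control_mean = 1.200    # guinea grass, mean nitrogen per cent (from Table 14)
t_5pct = 2.447          # 6 degrees of freedom, P = 0.05

# Standard error of a treatment mean, true units, then as per cent of control.
se_mean = math.sqrt(error_variance / samples_per_grass) / scale
se_pct_of_control = se_mean / control_mean * 100

# Critical difference between two percentages.
critical_pct = se_pct_of_control * math.sqrt(2) * t_5pct

# Rule of thumb: t x sqrt(2) is close to three once the error variance
# has more than about 10 degrees of freedom.
rule_of_thumb_factor = 2.101 * math.sqrt(2)

print(round(se_pct_of_control, 2), round(critical_pct, 1), round(rule_of_thumb_factor, 2))
```

With only 6 degrees of freedom, as here, the factor is nearer 3.5, which is why the exact product is used in this example rather than the rule of thumb.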

In some problems, there may be no convenient standard treatment,
or the control may be so different from the rest of the treatments
as to be unsuitable as a basis of comparison. Some
authorities prefer to express the results as a percentage of the
general mean of all the variates, as shown below:

Table 14. Summary of Results

Fodder grass        Mean nitrogen, %   Mean, % of general mean   Classification
Para grass          1.450              123                       Very good
Elephant grass      1.285              109                       Good
Guinea grass        1.200              102                       Average
General mean        1.178              100                       Average
Uba cane            1.115               95                       Average
Guatemala grass     1.000               90
Coimbatore cane     1.060

The standard error of the general mean, which is computed
from all the variates, is less than the standard error of any
treatment mean, and it is advisable to take this into account
in comparing the various treatments with the general mean.
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Frriiii tbf* analysis of Taria^nce table, the .standard error o! 

iTJ)2 

the goiioral is and remembering to reveii to the 

original units by dividing' by IOC),-' the critieal diilereiK^e fur 
comparing any treatment moan with general incaii is 

X 2.447 100 ^ 

' — 1 — y— ^ X ■ 6.9O5 expressed as a 

iK‘r«i‘ii!ago of Ihe gonerul mean. The elassifiealioii is mnvh 
saiiH^ UK that olilairied e,om paring the treatments witli the 
eojitfoh but the second method has made it possible to segregati’* 
the elophaoi grass into a class by itself, iateririediate hvi i he 
FanI grass aiul the avemge gra.de. 

These alternative methods of elaborating significant differences have been discussed at some length, because experimental reports are full of examples in which a valid statistical analysis of the data has been effected, but the final summary of the results leaves much yet to be desired. The sole object of statistical evaluation is an accurate and intelligible appreciation of the information supplied by the data. Even in experiments involving a large number of treatment comparisons, a clear statement of conclusions should offer no difficulty provided an efficient technique is used for grading the treatment means in accordance with the statistical tests.

THE z TEST

The F test is merely a recent version of the older and more familiar z test as inaugurated by Fisher. As there may be some readers who have got accustomed to and prefer to use the older form, it is advisable here to give a brief account of the z test and to show how the Tables of F may be derived from the Tables of z. Fisher's z is equivalent to half the difference between the Napierian or hyperbolic logarithms* of the variances it is desired to compare, i.e.,

\[ z = \tfrac{1}{2} \log_e \left( \frac{\text{variance}_1}{\text{variance}_2} \right) \]

where variance 1 is the greater, and the numbers of degrees of freedom of the two variances are n1 and n2, respectively.

z is approximately normally distributed, and tables have been compiled to show, for probabilities of 0.05 and 0.01, the theoretical value of z for different levels of n1 and n2. A copy of the 5 per cent Table of z is reproduced in the Appendix (Table III). The variances are significantly different when the calculated value of z exceeds the reading from the Table of z corresponding to the appropriate values of n1 and n2. In applying the z test to the data of Table 12, the required calculations would be as follows:

* Napierian logarithms are tabulated in the Appendix (Table V).



                    Degrees of    Variance    Log_e of     z by calculation, i.e.,
                    freedom                   variance     half the difference between logs
Variance 1 (n1)         5          293.3       5.6812
Variance 2 (n2)         6           19.2       2.9549              1.363

The reading from the Table of z for n1 = 5 and n2 = 6 is 0.7394. z by calculation is much greater than this, so that the difference between the class means is definitely significant, which is exactly the conclusion previously obtained by the use of the F test.

The Table of F was originally compiled in order to eliminate the necessity of looking up the Napierian logarithms, a somewhat finicky operation. Now

\[ z = \tfrac{1}{2} \log_e \left( \frac{\text{variance}_1}{\text{variance}_2} \right) \]

and

\[ F = \frac{\text{variance}_1}{\text{variance}_2} \]

so that F represents the number whose Napierian logarithm is equal to 2z. For instance, the reading of z in the above example was 0.7394; twice this value is 1.4788. This last number is the Napierian logarithm of 4.388, which is the reading of F obtained originally in applying the F test. The two tests therefore are bound to give identical results. The F test is admittedly the simpler one to apply. On the other hand, if the student is to keep au fait with recent literature on agricultural research, it is important for him to be equally familiar with either method of procedure. For this reason, in the succeeding examples the z test has occasionally been used in preference to the F test.
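The relation between z and F is easily verified numerically. The following Python sketch uses the two variances and the table reading quoted in the text; the function names `z_from_variances` and `f_from_z` are illustrative assumptions, not established terms.

```python
import math

def z_from_variances(variance_1, variance_2):
    """Fisher's z: half the Napierian log of the ratio, greater variance on top."""
    greater = max(variance_1, variance_2)
    smaller = min(variance_1, variance_2)
    return 0.5 * math.log(greater / smaller)

def f_from_z(z):
    """F is the number whose Napierian logarithm equals 2z."""
    return math.exp(2 * z)

# Variances quoted in the text for the data of Table 12.
z_calculated = z_from_variances(293.3, 19.2)
f_calculated = f_from_z(z_calculated)      # recovers the variance ratio

# Table reading of z for n1 = 5 and n2 = 6, converted to the reading of F.
f_table = f_from_z(0.7394)

print(round(z_calculated, 4), round(f_calculated, 2), round(f_table, 3))
```

Since the conversion is a strictly increasing function, a calculated z exceeds its table reading exactly when the calculated F exceeds the corresponding reading of F.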
SIMILARITY BETWEEN THE z AND t TESTS

It has been demonstrated that the F and z tests are alternative methods of determining whether the treatment groups of variates differ significantly from one another. When only two treatments are concerned, they must test whether there is a significant difference between the two treatment means. But the significance of the difference between two treatment means may also be determined by calculating t and comparing this value with the appropriate one from the Table of t. It follows that, with a single pair of treatments, t and z (or F) are testing the same quantity, and, if statistical methods are to be regarded as efficient, they must give exactly the same answer. It will be found that this is true in practice, so that, when only two treatments are concerned, the application of both tests is a work of supererogation, as they are bound to lead to precisely the same conclusion. In these circumstances, the easier one to evaluate from the data should be used.
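The agreement of the three tests for a single pair of treatments rests on the identity F = t², and hence z = log_e t. A minimal Python sketch follows. The two series of figures are invented purely for illustration and are not the data of Table 9; the function names are likewise assumptions.

```python
import math

def mean(values):
    return sum(values) / len(values)

def pooled_variance(a, b):
    """Within-series (error) variance from the two series combined."""
    ss = sum((x - mean(a)) ** 2 for x in a) + sum((x - mean(b)) ** 2 for x in b)
    return ss / (len(a) + len(b) - 2)

def t_statistic(a, b):
    """Difference between the means divided by its standard error."""
    s2 = pooled_variance(a, b)
    return (mean(a) - mean(b)) / math.sqrt(s2 * (1 / len(a) + 1 / len(b)))

def f_statistic(a, b):
    """Between-series variance (1 degree of freedom) over the error variance."""
    grand = mean(a + b)
    between = len(a) * (mean(a) - grand) ** 2 + len(b) * (mean(b) - grand) ** 2
    return between / pooled_variance(a, b)

# Hypothetical data, for illustration only.
series_a = [12, 15, 11, 14, 13]
series_b = [16, 14, 17, 15, 18]

t = t_statistic(series_a, series_b)
f = f_statistic(series_a, series_b)
z = 0.5 * math.log(f)

print(round(t, 4), round(f, 4), round(z, 4))
```

Whatever figures are substituted, the square of t reproduces F, which is why a second test adds nothing when only two treatments are compared.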

As an illustration of the truth of these statements, it is proposed to test the following data, taken from Table 9, by all three methods (F, z, and t).

Between series A and B
Within series A and B

The nearest reading from the Table of t is that corresponding to a probability of 0.2. By all three tests, the difference of 2 ounces between the mean treatment values is nonsignificant. Also, the excess of the readings of F and z over their respective calculated values is of a degree that one would normally associate with a probability between 0.1 and 0.2. It may safely be assumed that the three tests are in complete agreement.

INTERACTIONS

The division of the total variance into the treatment and error components is the simplest form of the analysis of variance. The experimental design has often to be made much more complex in character so as to test simultaneously the effect of several distinct treatment series and their reaction on one another. In such comprehensive experiments it is necessary to split up, in the correct proportions, the total treatment variance among the various components to which it can be correctly allocated. This detailed statistical analysis is an essential preliminary to an accurate appreciation of the factors responsible for any apparent difference between the treatments or combinations of treatments under observation. The statistical methods employed introduce no new principles; they represent merely an extension of the procedure already described in connection with simple data. There is, however, one term, interaction, which possibly requires a little explanation. Its exact significance will be most easily comprehended by discussion of a concrete example. Consider a field experiment in which the yield data for two varieties of wheat, A and B, have been recorded for two seasons, I and II, the first season being a wet one and the second a dry one. If both varieties are types that thrive under dry weather conditions, higher yields in the second season than in the first could be expected and the percentage increase in A would be approximately the same as in B. The relative difference between the varieties would be maintained, the best one in season I retaining its superiority in season II. In other words, the response of the two varieties to the change in climatic conditions would be similar. On the other hand, if A is a type that does well in dry weather, but B is one which is at its optimum in a humid environment, the chances are that the yield of A will rise and the yield of B fall in the second season. The relative yields of the two varieties will be considerably altered, and if the difference in response to the change in season is sufficiently great, the better yielder in the first season may even become the lower yielder in the second. The increase in A will be more or less compensated for by the decrease in B so that there will not be a marked difference between the total yields for each season. Under these circumstances, the varieties have reacted differently to a change of climatic conditions, and in statistics, this difference in response in one series of treatments to a change in a second series is termed the interaction, which in this example would be defined as the interaction of season on variety. In such cases, where one series of treatments is superimposed on a second, it is necessary to take into consideration, not only the straightforward treatment comparisons, but also the way in which the combinations of treatments react on one another. In the above example, there are really 2 × 2 or 4 treatments, viz.,

Variety A in Season I
Variety A in Season II
Variety B in Season I
Variety B in Season II
In a complete analysis of such data, the total sum of squares for the four treatments would have to be split up into

a. That portion ascribable to differences between varieties A and B.
b. That portion ascribable to differences between seasons I and II.
c. That portion ascribable to differences in the response of each variety to the seasons.

a and b are generally termed ... they are evaluated from the average or total difference ... one series of treatments for all levels of ... c is the interaction; in this example, the interaction of season on variety. The F or z test ... whether there is a significant difference between ... the mean yields for ... in so far as the average response of ... of indicating this. Finally, a significant interaction variance, c, as determined by the F test, proves that the varieties have responded differently to the change in season and permits a valid comparison of the mean varietal yields for each season. From this comparison it should be possible to specify which is the better variety to sow in each type of season.
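The three components a, b, and c can be illustrated with a short Python sketch for the 2 × 2 case. The treatment totals below are hypothetical, chosen only to contrast a parallel response with a reversed one; the function `two_by_two_ss` and its argument names are assumptions made for this illustration. Each component is taken as the square of the relevant difference between totals, divided by the total number of variates.

```python
def two_by_two_ss(a_wet, a_dry, b_wet, b_dry, n):
    """Sums of squares for variety, season, and interaction.

    Each argument is a treatment total of n variates (variety A or B
    in the wet or dry season); 4 * n is the total number of variates.
    """
    total_n = 4 * n
    variety = ((a_wet + a_dry) - (b_wet + b_dry)) ** 2 / total_n
    season = ((a_wet + b_wet) - (a_dry + b_dry)) ** 2 / total_n
    # Interaction: difference between the two varieties' responses to season.
    interaction = ((a_wet - a_dry) - (b_wet - b_dry)) ** 2 / total_n
    return variety, season, interaction

# Hypothetical totals: both varieties gain equally in the dry season.
parallel = two_by_two_ss(40, 50, 30, 40, n=5)

# Hypothetical totals: A gains and B loses by the same amount.
reversed_response = two_by_two_ss(30, 50, 50, 30, n=5)

print(parallel)
print(reversed_response)
```

In the second case the variety and season components both vanish while the interaction is large, which corresponds to the situation described above in which the seasonal totals hardly differ although the varieties have reacted quite differently.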

An interaction of this type is termed a first-order interaction; it shows how changes in one factor X react to changes in a second factor Y or vice versa. Where three distinct series of comparisons are being tested, X, Y, and Z, there will be first-order interactions of X on Y, X on Z, and Y on Z and also a second-order interaction showing how X behaves under various combinations of Y and Z, how Y responds to changes of X and Z, or how Z responds to changes of X and Y. The calculation and utility of interaction variances are exemplified in the succeeding examples.

Example 8. In a feeding experiment with tropical dairy cattle, 40 cows known to be of approximately the same yield potentiality were divided into eight groups of five and one ration allocated to each group. The experiment was planned on the factorial system, a term denoting a design in which two or more series of treatments or factors are included in all possible combinations. In this experiment, four types of roughage were being tested in conjunction with two rates of concentrate ration, necessitating, on a 4 × 2 factorial arrangement, eight distinct treatment combinations. These are detailed along with the results in Table 15.

The error sum of squares measures the dispersion of the yield data within the different groups of five animals fed on one particular ration. The eight individual rations are complex in nature resulting from the comparison, in a single experiment, of four types of roughage and two quantities of concentrate (0 and 1). The sum of squares for rations should therefore be resolved into its components, viz., that owing to differences in the

a. Roughage ration.
b. Concentrate ration.
c. Interaction of concentrate with roughage.

\[ \text{S.S. roughage} = \frac{90^2 + 110^2 + 115^2 + 165^2}{10} - \frac{480^2}{40} = 305 \text{ with 3 degrees of freedom} \]

\[ \text{S.S. concentrate} = \frac{242^2 + 238^2}{20} - \frac{480^2}{40} = 0.4 \text{ with 1 degree of freedom} \]

The interaction of roughage and concentrate accounts for the balance of the ration sum of squares and degrees of freedom.

\[ \text{S.S. interaction (concentrate} \times \text{roughage)} = 342.8 - (305 + 0.4) = 37.4 \text{ with } 7 - (3 + 1) \text{ or 3 degrees of freedom} \]
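The resolution of the ration sum of squares can be reproduced by the variable-squared method in a few lines of Python. The eight ration totals are those tabulated in the text for this example; the variable and function names are illustrative assumptions.

```python
# Ration totals in pints, each the sum of five cows, as tabulated in the text.
totals = {
    "with":    {"straw": 46, "hay": 60, "herbage": 61, "silage": 75},
    "without": {"straw": 44, "hay": 50, "herbage": 54, "silage": 90},
}
cows_per_group = 5

def sum_sq(values, n_each):
    """Sum of squared totals divided by the number of variates in each total."""
    return sum(v * v for v in values) / n_each

cells = [t for row in totals.values() for t in row.values()]
n_total = cows_per_group * len(cells)              # 40 cows
correction = sum(cells) ** 2 / n_total             # correction factor, 480^2 / 40

roughage_totals = [totals["with"][k] + totals["without"][k] for k in totals["with"]]
concentrate_totals = [sum(row.values()) for row in totals.values()]

ss_rations = sum_sq(cells, cows_per_group) - correction
ss_roughage = sum_sq(roughage_totals, 2 * cows_per_group) - correction
ss_concentrate = sum_sq(concentrate_totals, 4 * cows_per_group) - correction
ss_interaction = ss_rations - (ss_roughage + ss_concentrate)

for name, value in [("rations", ss_rations), ("roughage", ss_roughage),
                    ("concentrate", ss_concentrate), ("interaction", ss_interaction)]:
    print(f"{name:12s}{value:8.1f}")
```

The interaction is obtained here, as in the text, as the balance of the ration sum of squares after the two main components have been removed.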

Table 15. Mean Yield of Milk in Pints per Day

(Columns: reference number of animal in group; straw, hay, herbage, and legume silage, each with and without concentrates. Total with concentrates, 242; total without concentrates, 238.)

... to ascertain whether ... effect on the results ... work of supererogation. The calculated values of F for the other two factors, roughage and interaction, are greater than the corresponding theoretical values at the 1 per cent point, and the differences are therefore highly significant. This makes it valid to use the t test to compare (a) the milk yields obtained from the different types of roughage and (b) the proportionate yields from these fodders with and without concentrates.

Table 16. Analysis of Variance

Factor                                  S.S.    Degrees of   Variance   F (by          Table reading of F
                                                freedom                 calculation)   (P = 0.01)
Roughage                               305.0        3          101.7      40.68          4.51 approx.
Concentrate                              0.4        1            0.4
Interaction: Concentrate × roughage     37.4        3           12.5       5.00          4.51 approx.
Error                                   80.0       32            2.5

For the evaluation of the roughage, the comparable yields are the totals obtained from the 10 cows fed on each of the fodders, irrespective of the concentrate ration used. These totals are as follows:


Straw       90
Hay        110
Herbage    115
Silage     165

A difference between any two of these totals greater than √(2.5 × 10 × 2) × 2 = 14.14 pints is significant. Straw is the poorest and legume silage very markedly the best roughage of the four. This is generally true whether or not concentrates are fed in addition to the coarse fodder.

The nonsignificant variance for quantity of concentrate leads to the rather unexpected conclusion that the addition of concentrate to the ration has not, on the average, resulted in an increase in yield. The significant interaction permits the use of the t test to compare the following treatment totals, from which the interaction sum of squares was estimated:

Treatment factor          Straw    Hay    Herbage    Silage
With concentrates           46      60       61        75
Without concentrates        44      50       54        90

Differences between these totals greater than √(2.5 × 5 × 2) × 2 = 10 are significant.

It would appear that the addition of concentrates improves the value of the hay and herbage relative to straw: without concentrates these two fodders just fail to give a significantly higher yield than that obtained from straw. The legume silage, with or without concentrates, is better than any of the other rations. Furthermore, with the silage the concentrates actually depress the milk yield, presumably on account of a ration too rich in protein. This fact also explains why in the analysis of variance the effect of concentrates is apparently nil. With the first three forms of roughage, concentrates tend to increase yields, but with the silage, they depress the yields; hence, in considering the average effect of concentrates for all four fodders, the variance is nonsignificant.
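The two critical differences used in this example, 14.14 pints for the roughage totals and 10 pints for the individual ration totals, follow one pattern and may be sketched in Python. The error variance of 2.5 and the rounded value t = 2 are taken here as assumptions, being the figures implied by the critical differences quoted; the function name is illustrative only.

```python
import math
from itertools import combinations

def critical_difference_of_totals(error_variance, n_per_total, t_value):
    """Smallest significant difference between two totals of n variates each."""
    return math.sqrt(error_variance * n_per_total * 2) * t_value

error_variance = 2.5   # assumed: the value implied by the quoted differences
t_value = 2            # assumed: rounded 5 per cent value of t

cd_roughage = critical_difference_of_totals(error_variance, 10, t_value)
cd_rations = critical_difference_of_totals(error_variance, 5, t_value)

roughage_totals = {"straw": 90, "hay": 110, "herbage": 115, "silage": 165}

# Pairs of roughage totals whose difference does NOT exceed the critical value.
not_significant = [
    (a, b) for a, b in combinations(roughage_totals, 2)
    if abs(roughage_totals[a] - roughage_totals[b]) <= cd_roughage
]

print(round(cd_roughage, 2), cd_rations, not_significant)
```

On these figures hay and herbage are the only pair of roughages that cannot be separated, which agrees with the grading given above.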

This example effectively illustrates the advantage of examining two or more factors in a single experiment and resolving the analysis of variance into its ultimate components. In this feeding trial, the inclusion of both roughage and concentrate has made it possible to ascertain which is the best fodder and also to show for each one the exact economy of adding concentrates to the ration.
DIRECT CALCULATION OF AN INTERACTION

When only two values representing the same number of variates are concerned in the determination of any particular component of the analysis of variance, the sum of squares for this factor can be easily calculated directly from the difference between the values. If Ta represents the total of all the n variates in the first series and Tb the corresponding total for the n variates in the second series, then by the variable-squared method,

\[ \text{S.S. required} = \frac{T_a^2}{n} + \frac{T_b^2}{n} - \frac{(T_a + T_b)^2}{2n} = \frac{(T_a - T_b)^2}{2n} \]

Thus in the last example,

\[ \text{S.S. for concentrates} = \frac{(242 - 238)^2}{40} = 0.4 \]

This method can be applied to components of the analysis of variance in which more than two factors are concerned, provided the totals from an equal number of variates are taken in pairs in all combinations, the differences between each pair squared, these squares summed, and then the sum divided by the total number of variates in the data. Thus

\[ \text{S.S. roughage} = \frac{(90-110)^2 + (90-115)^2 + (90-165)^2 + (110-115)^2 + (110-165)^2 + (115-165)^2}{40} = 305 \text{ (as originally calculated)} \]

In examples of this type where a relatively large number of differences have to be calculated from an even number of totals, it is simpler to assess the sum of squares from the differences between all combinations of these totals, two at a time. Thus

\[ \text{S.S. roughage} = \frac{[(90+110)-(115+165)]^2 + [(90+115)-(110+165)]^2 + [(90+165)-(110+115)]^2}{40} = 305 \]

This arithmetical technique can be further extended to calculate the interaction sum of squares between roughage and concentrates. The totals which determine the value of the interaction are tabulated below:

Treatment factor          Straw    Hay    Herbage    Silage
With concentrates           46      60       61        75
Without concentrates        44      50       54        90
Difference                  +2     +10       +7       -15

The interaction really tests whether the addition of the concentrates to the ration has or has not had approximately the same effect in each of the four fodders. If there is no interaction effect, the difference between straw with and without concentrates will be exactly the same as that between hay with and without concentrates. Comparing these data, the addition of concentrate has changed the yield from






the straw ration by 46 - 44 = +2
the hay ration by 60 - 50 = +10

The effect of the concentrate has been much more marked in the case of the hay; in other words there is a differential response or interaction when the influence of concentrates on the hay and straw rations is compared. The magnitude of this interaction or difference in response is proportional to 2 - 10 or -8. Similarly, by taking the fodders in all the other possible combinations, two at a time (straw and herbage, straw and silage, hay and herbage, etc.), any difference in response to concentrate can be measured. The sum of the squares of these values divided by 40 will be the interaction sum of squares. Thus

\[ \text{S.S. interaction} = \frac{(2-10)^2 + (2-7)^2 + (2+15)^2 + (10-7)^2 + (10+15)^2 + (7+15)^2}{40} = 37.4 \text{ (as originally calculated)} \]

The alternative method of calculation in which differences are assessed from the required totals taken in all possible pairs is also applicable.

\[ \text{S.S. interaction} = \frac{1}{40} \Big( \{[(46+60)-(44+50)] - [(61+75)-(54+90)]\}^2 + \{[(46+61)-(44+54)] - [(60+75)-(50+90)]\}^2 + \{[(46+75)-(44+90)] - [(60+61)-(50+54)]\}^2 \Big) \]

\[ = \frac{20^2 + 14^2 + 30^2}{40} = 37.4 \]
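The method of pairwise differences lends itself to a very short sketch in Python. The totals and responses are those of the example; the function name `ss_from_pairwise_differences` is an illustrative assumption.

```python
from itertools import combinations

def ss_from_pairwise_differences(values, total_variates):
    """Square every pairwise difference, sum, and divide by the number of variates."""
    squared = sum((a - b) ** 2 for a, b in combinations(values, 2))
    return squared / total_variates

total_variates = 40                                  # 40 cows in the experiment

# Concentrate totals, roughage totals, and the response of each fodder
# to concentrates (with minus without), all as given in the text.
ss_concentrate = ss_from_pairwise_differences([242, 238], total_variates)
ss_roughage = ss_from_pairwise_differences([90, 110, 115, 165], total_variates)
responses = [46 - 44, 60 - 50, 61 - 54, 75 - 90]     # +2, +10, +7, -15
ss_interaction = ss_from_pairwise_differences(responses, total_variates)

print(ss_concentrate, ss_roughage, ss_interaction)
```

The three results agree with those obtained earlier by the variable-squared method, namely 0.4, 305, and 37.4.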


ANALYSIS OF DATA DIVIDED INTO SUBUNITS

In some experiments, it is advantageous to split up each of the variates of one series of treatments, Series A, into so many subunits in accordance with a second series of treatments, Series B, representing subsidiary components of each of the variates of Series A. The statistical technique has got to be modified if an accurate appreciation of the effect of all the different treatment factors and of their reaction on one another is to be obtained.

Example 9. As a test of the influence of the crop on the insect population of the soil, three soil types representing fallow, pasture, and orchard land were examined. Five soil samples were selected at random from each, taken to the laboratory, and a census of the insect population made (Berwick's data).


Table 17. Insect Population in Soil
(totals of the five samples from each soil type, by insect order)

Insect order        Fallow    Pasture    Orchard    Insect order total
Coccidae               6          5        132            143
Ants                   6         48        124            178
Thysanoptera           7        160         17            184
Unclassified          11        110         44            165
Total                 30        323        317            670 (grand total)

In the compilation of these data, a count was first made of the total number of insects in each of the 15 soil samples. The 15 values obtained in this way were then each split into 4 by classifying the insects observed under 4 insect orders. This subdivision gave a total of 60 subunits or final variates for statistical analysis. It should be obvious, however, that as originally only 15 soil samples were taken, the maximum number of degrees of freedom for the comparison of the effect of soil type on the insect population considered as a whole cannot exceed 14. The division of the original whole units to subunits in accordance with the tally for each insect order represented does not increase the number of replicates available for the original whole-unit treatment comparisons. Two estimates of the error variance have therefore to be calculated. The first applies to the whole-unit treatment factors, and is based on the dispersion shown by the original or whole units. The second applies to the final treatment classes to which the original variates have been subdivided and represents the dispersion of the subunits after due allowance has been made for all the measurable factors affecting the data. In the complete analysis of variance, the total sum of squares for the 60 final variates has therefore to be apportioned to the following components:

Whole-unit S.S., comprising
    Soil-type S.S.
    S.S. within similar samples, i.e., the error S.S. for the whole-unit treatment comparisons
Insect order S.S.
Interaction: Insect order × soil type
Error S.S. for the subunit treatment comparisons

In calculating these components, the variable-squared method has been used, and the respective sums of squares have been expressed in subunit values throughout.


\[ \text{Total S.S.} = \text{S.S. of 60 subunit values} - \text{C.F.} = 1^2 + 2^2 + \cdots + 1^2 + 11^2 + 12^2 - \text{C.F.} = 11{,}070.3 \text{ with 59 degrees of freedom} \]

Whole-unit S.S. There were originally 15 soil samples or whole units. The required sum of squares is a measure of the total dispersion shown by these values.

\[ \text{Whole-unit S.S.} = \frac{8^2 + 10^2 + 3^2 + \cdots + 64^2 + 65^2 + 51^2}{4} - \text{C.F.} = 3{,}672.8 \text{ with 14 degrees of freedom} \]

The division of the sum of the squared values by four is necessary, as we are working in subunit values throughout and each of the whole units represents the total of four subunits.

This whole-unit sum of squares represents the combined effect on the insect population of differences between the soil types and the unavoidable differences in the samples from the same soil. These two components have next to be calculated.

\[ \text{Soil S.S.} = \frac{30^2 + 323^2 + 317^2}{20} - \text{C.F.} = 2{,}804.2 \text{ with 2 degrees of freedom} \]

Within-soil or Error (a) S.S. This accounts for the balance of the whole-unit sum of squares and number of degrees of freedom, equivalent to 3,672.8 - 2,804.2 = 868.6 with 14 - 2 or 12 degrees of freedom. It can also be calculated directly if the variation in the five replicates from each soil type is assessed independently, thus

\[ \text{Within-soil or error (a) S.S.} = \left( \frac{8^2 + 10^2 + 3^2 + \cdots + 6^2}{4} - \frac{30^2}{20} \right) + \left( \frac{42^2 + 72^2 + \cdots + 107^2}{4} - \frac{323^2}{20} \right) + \left( \frac{84^2 + 53^2 + \cdots + 51^2}{4} - \frac{317^2}{20} \right) = 868.6 \]

\[ \text{Insect order S.S.} = \frac{143^2 + 178^2 + 184^2 + 165^2}{15} - \text{C.F.} = 65.9 \text{ with 3 degrees of freedom} \]

Interaction: Insect Order × Soil Type. This is equivalent to the aggregate treatment sum of squares less the sum of squares for soils and insect orders as already calculated. The aggregate treatment sum of squares is calculated from the values shown by each insect order in each soil type.

\[ \text{Aggregate treatment S.S.} = \frac{\cdots + 124^2 + 17^2 + 44^2}{5} - \text{C.F.} = 7{,}574.5 \text{ with 11 degrees of freedom} \]

\[ \text{Interaction S.S.} = 7{,}574.5 - (65.9 + 2{,}804.2) = 4{,}704.4 \text{ with } 11 - (2 + 3) \text{ or 6 degrees of freedom} \]

\[ \text{Error (b) S.S.} = \text{total S.S.} - \text{aggregate of S.S. of component factors} = 11{,}070.3 - (3{,}672.8 + 65.9 + 4{,}704.4) = 2{,}624.2 \text{ with } 59 - (14 + 3 + 6) \text{ or 36 degrees of freedom} \]
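Those components that depend only on the marginal totals can be checked with a brief Python sketch. It assumes that the correction factor C.F. is the square of the grand total divided by the 60 subunits; the whole-unit sum of squares is entered as quoted in the text, and the variable names are illustrative only.

```python
grand_total = 670
n_subunits = 60

# Assumption: C.F. is the grand total squared over the number of subunits.
correction = grand_total ** 2 / n_subunits

soil_totals = {"fallow": 30, "pasture": 323, "orchard": 317}       # 20 subunits each
order_totals = {"Coccidae": 143, "Ants": 178,
                "Thysanoptera": 184, "Unclassified": 165}          # 15 subunits each

soil_ss = sum(t ** 2 for t in soil_totals.values()) / 20 - correction
order_ss = sum(t ** 2 for t in order_totals.values()) / 15 - correction

whole_unit_ss = 3672.8                  # as quoted in the text
error_a_ss = whole_unit_ss - soil_ss    # balance of the whole-unit S.S.

# Degrees of freedom left for error (b) after whole units, insect order,
# and interaction have been accounted for.
error_b_df = (n_subunits - 1) - (14 + 3 + 6)

print(round(soil_ss, 1), round(order_ss, 1), round(error_a_ss, 1), error_b_df)
```

Dividing each sum of squared totals by the number of subunits it contains is what keeps every component expressed in subunit values, as the text requires.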
In subtracting the component sums of squares, it should be noted that the whole-unit sum of squares includes the soil and error (a) factors; therefore these do not appear separately in this calculation.


Table 18. Analysis of Variance

Factor                               S.S.       Degrees of   Variance    Log_e of     z (by
                                                freedom                  variance*    calculation)
Total                              11,070.3        59
Whole units                         3,672.8        14
  Soil                              2,804.2         2        1,402.1      4.9431       1.4817
  Error (a)                           868.6        12           72.4      1.9796
Insect order                           65.9         3           22.0
Interaction: Insect order × soil    4,704.4         6          784.1      4.3619       1.1877
Error (b)                           2,624.2        36           72.9      1.9865

* As z is based on the difference between the logarithmic values, it simplifies the tabulation of the logarithms if the variances are all divided or multiplied by the power of 10 which will fix the decimal point of the smallest variance one place to the right. Thus, in this example the logarithms quoted are for the variances divided by 10, converting the smallest variance, 22.0, to 2.20.
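That the division of every variance by the same power of 10 leaves z unaltered is easily confirmed. The Python sketch below uses the variances of Table 18; the function name `fisher_z` is an illustrative assumption.

```python
import math

def fisher_z(variance_1, variance_2):
    """Half the difference between the Napierian logs, greater variance first."""
    greater = max(variance_1, variance_2)
    smaller = min(variance_1, variance_2)
    return 0.5 * (math.log(greater) - math.log(smaller))

soil, error_a = 1402.1, 72.4
interaction, error_b = 784.1, 72.9

z_soil = fisher_z(soil, error_a)
z_interaction = fisher_z(interaction, error_b)

# Dividing both variances by 10 shifts each logarithm by the same amount,
# so the difference between the logarithms, and hence z, is unchanged.
z_soil_scaled = fisher_z(soil / 10, error_a / 10)

print(round(z_soil, 4), round(z_soil_scaled, 4), round(z_interaction, 4))
```

Small disagreements in the fourth decimal place against the tabulated values arise only from the rounding of the four-figure logarithms.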

Differences between the soils, in regard to the insect population as a whole, are tested by the error (a) variance; and the other treatment comparisons, which are a result of the division to subunits, by error (b). The reading of z at the 5 per cent point for the soil and error (a) comparison is 0.6786. This compares with a calculated value of z of 1.4817, and there is, therefore, a very definite significance in the differences between the insect populations of the three soils. The comparable totals are:

Fallow      30
Pasture    323
Orchard    317

A difference between these totals greater than the critical value given by the t test is significant. The fallow soil has therefore a very much smaller insect population than the other two types.
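The comparison of the soil totals may be sketched in Python on the same pattern as the other critical differences in this chapter. Two assumptions are made: each soil total is treated as the sum of 20 subunits, so that its standard error is based on the error (a) variance of 72.4 multiplied by 20, and t is taken as 2.179, the usual 5 per cent value for 12 degrees of freedom. The resulting critical difference is therefore an illustration, not a figure quoted from the text.

```python
import math
from itertools import combinations

soil_totals = {"fallow": 30, "pasture": 323, "orchard": 317}

error_a_variance = 72.4     # error (a) variance, in subunit values
subunits_per_total = 20     # 5 samples of 4 subunits in each soil total
t_5_per_cent = 2.179        # assumed 5 per cent value of t for 12 degrees of freedom

se_difference = math.sqrt(error_a_variance * subunits_per_total * 2)
critical_difference = t_5_per_cent * se_difference

significant_pairs = [
    (a, b) for a, b in combinations(soil_totals, 2)
    if abs(soil_totals[a] - soil_totals[b]) > critical_difference
]

print(round(critical_difference, 1), significant_pairs)
```

On these assumptions the fallow total differs significantly from both the others, while pasture and orchard cannot be separated, in agreement with the conclusion stated above.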



Proceeding now to the other treatment comparisons, the insect-order variance is less than that of error (b) and obviously nonsignificant. The calculated value of z for the interaction is 1.1877, which compares with a theoretical value at the 5 per cent point of approximately 0.44. This proves that the interaction is significant, and the t test may be used to compare the treatment totals from which the interaction variance was calculated. The required totals are for each insect order in each soil type.

[Table of totals for Coccidae, ants, Thysanoptera, and unclassified insects (rows) in the fallow, pasture, and orchard soils (columns); the figures are not legible in the source.]

Each of these values represents the total of five subunits, so that the standard error of the difference between any pair is $\sqrt{72.9 \times 5 \times 2} = 27.0$. The number of degrees of freedom of the error (b) variance is 36, and the value of t for a probability of 0.05 is approximately 2. A difference greater than 2 × 27.0 or 54 is significant. In summarizing the results from such an interaction table, it is best to consider the individual rows or individual columns of values in turn. From the columns, it is obvious that there is no difference in the numbers of each order in the fallow land, but there is a predominance of Thysanoptera and unclassified insects in the pasture land and of Coccidae and ants in the orchard soils. Comparison of values in the individual rows leads to the same conclusion.
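
The least difference between two subunit totals that can be called significant follows directly from the error (b) variance. A minimal Python sketch of that arithmetic is given here; the function name is hypothetical, and t is taken as 2, the approximate value used in the text.

```python
import math

def significant_difference(error_variance, units_per_total, t_value):
    """Standard error of the difference between two totals, and the least
    difference judged significant.

    Each total is the sum of `units_per_total` units, so the variance of the
    difference between two totals is error_variance * units_per_total * 2.
    """
    standard_error = math.sqrt(error_variance * units_per_total * 2)
    return standard_error, t_value * standard_error

# Error (b) variance 72.9, five subunits in each total, t of about 2.
se, least_difference = significant_difference(72.9, 5, 2.0)
print(round(se, 1), round(least_difference, 1))
```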

To the novice, this type of statistical analysis may appear somewhat complicated. It is, however, merely a logical extension of the technique that would have been used had no subdivision to insect orders been possible. For example, if we ignore this subdivision, the analysis becomes the simple one in which the total sum of squares of 15 variates is split up between the treatment and error components, as follows:

$$\text{Total S.S.} = 8^2 + 10^2 + 3^2 + \cdots + 64^2 + 65^2 + 61^2 - \text{C.F.} = 14{,}691.2 \text{ with 14 degrees of freedom}$$

$$\text{Treatment S.S.} = \frac{30^2 + 323^2 + 317^2}{5} - \text{C.F.} = 11{,}216.8 \text{ with 2 degrees of freedom}$$

$$\text{Error S.S.} = 14{,}691.2 - 11{,}216.8 = 3{,}474.4 \text{ with 12 degrees of freedom}$$

The sums of squares and the corresponding variances are exactly four times those quoted in the original analysis of variance for the whole unit, soil, and error (a) components, respectively. The reason for this is that the values quoted above are expressed in whole units, while those in Table 18 are in subunits, each equivalent to one-quarter of a whole unit. The z and t tests applied to either table would give the same result.

For example, using the second analysis, a significant difference between totals for each soil would be one greater than

$$\sqrt{\text{error variance} \times 5 \times 2} \times 2.179 = 117.1 \text{ (as originally calculated).}$$

Thus the evaluation of the error (a) variance is an exact parallel of the simple analysis stated above with the various components quoted in smaller units.
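
The fourfold relation between the two analyses is easily verified. The short Python sketch below is illustrative only; the variable names are hypothetical and the figures are those quoted in the text for the analysis in whole units.

```python
# Sums of squares quoted in the text for the analysis in whole units.
whole_unit_ss = {"total": 14691.2, "soil": 11216.8, "error": 3474.4}

# Each whole unit is divided into four subunits (the four insect-order groups).
subunits_per_whole_unit = 4

# Dividing by the number of subunits expresses each sum of squares on the
# subunit scale used in Table 18.
subunit_ss = {name: ss / subunits_per_whole_unit for name, ss in whole_unit_ss.items()}

error_df = 12
error_variance_subunit = subunit_ss["error"] / error_df

for name, ss in subunit_ss.items():
    print(name, round(ss, 1))
print("error (a) variance:", round(error_variance_subunit, 1))
```

The values obtained agree with the whole-unit, soil, and error (a) lines of Table 18.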

Table 19 records the daily increment in diameter of two genera of thread blight, Marasmius and Corticium, of which three isolations of the former and two of the latter are under observation. Six plates of each isolation were prepared and daily growth measurements were taken over a 3-day period. It is desired to use these data to compare the growth rates of the two genera over the different days and ascertain also whether the various isolations are different fungi or merely separate cultures of one and the same fungus.

Table 19. Daily Increments of Thread Blights in Half-millimeter Units

[The body of Table 19 is not legible in the source. Its first half lists the daily increments for the six plates of each Marasmius and Corticium isolation, with the totals of each series of six plates at the foot of each column; its second half gives the daily totals and isolation totals for each genus and the total for both genera.]

It will be seen that, in this experiment, the final treatment units are 15 in number, as represented by the totals of each series of six plates recorded at the bottom of each column in the first half of Table 19. Therefore, the treatments account for 14 degrees of freedom. As there are 90 observations in all, the total sum of squares has 89 degrees of freedom and there are 89 - 14 or 75 degrees of freedom available for the estimate of error. More directly, with 15 treatments and six replicates of each, the within-series or error sum of squares will have 15 × 5 or 75 degrees of freedom. The basic analysis is therefore:

$$\text{Total S.S.} = 4{,}364.4 \text{ with 89 degrees of freedom}$$

$$\text{Treatment S.S.} = 3{,}790.7 \text{ with 14 degrees of freedom}$$

$$\text{Error S.S.} = 4{,}364.4 - 3{,}790.7 = 573.7 \text{ with 75 degrees of freedom}$$
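
The allocation of degrees of freedom and the error variance for this basic analysis may be set out as a short calculation. The Python sketch below is illustrative; the variable names are hypothetical, and the sums of squares are those quoted in the text.

```python
treatments = 15   # five isolations, each measured on three days
replicates = 6    # plates of each isolation
observations = treatments * replicates

total_df = observations - 1
treatment_df = treatments - 1
error_df = total_df - treatment_df   # equally, treatments * (replicates - 1)

total_ss = 4364.4
treatment_ss = 3790.7
error_ss = total_ss - treatment_ss
error_variance = error_ss / error_df

print(total_df, treatment_df, error_df)
print(round(error_ss, 1), round(error_variance, 2))
```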

The next step is to split up the treatment sum of squares into its correct components. In assessing these, it is necessary to take into account the fact that the 3-day period is common to all the series and that there may be an interaction between the time factor and the different isolations or genera. The isolations for each genus are entirely independent of one another; consequently there can be no interaction between genus and isolation. The allocation of the 14 degrees of freedom available for treatment will be as follows:

Genus ............................  1
Days .............................  2
Marasmius isolations .............  2
Corticium isolations .............  1
Day × genus ......................  2
Day × isolations in Marasmius ....  4
Day × isolations in Corticium ....  2
Total ............................ 14

In calculating the respective sums of squares, the variable-squared method has been used throughout. A slight complication is introduced in these calculations by reason of the different number of observations in the various treatment totals. In evaluating certain sums of squares, this complication makes it necessary to divide, in turn, the square of each treatment total by the number of variates it represents, then to sum the resultant values, and to subtract the correction factor.


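Factor and S.S.

$$\text{Genus: } \frac{765^2}{54} + \frac{927^2}{36} - \frac{1{,}692^2}{90} = 2{,}898.1$$

$$\text{Day: } \frac{574^2 + 566^2 + 552^2}{30} - \frac{1{,}692^2}{90} = 8.3$$

$$\text{Marasmius isolation: } \frac{233^2 + 186^2 + 346^2}{18} - \frac{765^2}{54} = 751.4$$

$$\text{Corticium isolation: } \frac{473^2 + 454^2}{18} - \frac{927^2}{36} = 10.1$$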
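
The variable-squared method with unequal numbers of observations in the totals can be sketched in Python as follows. This is an illustration under the stated totals, not the author's own working; the function name `ss_from_totals` is hypothetical. The results agree with the tabulated values to within the rounding of the intermediate figures.

```python
def ss_from_totals(totals, counts, grand_total, grand_count):
    """Variable-squared method: sum of T^2 / n_T minus the correction factor."""
    correction_factor = grand_total ** 2 / grand_count
    return sum(t ** 2 / n for t, n in zip(totals, counts)) - correction_factor

# Genus totals: Marasmius (54 observations) and Corticium (36 observations).
genus_ss = ss_from_totals([765, 927], [54, 36], 1692, 90)

# Daily totals for both genera together, 30 observations each.
day_ss = ss_from_totals([574, 566, 552], [30, 30, 30], 1692, 90)

# Isolations are compared within their own genus, so the correction factor
# is that of the genus concerned.
marasmius_ss = ss_from_totals([233, 186, 346], [18, 18, 18], 765, 54)
corticium_ss = ss_from_totals([473, 454], [18, 18], 927, 36)

for name, value in [("genus", genus_ss), ("day", day_ss),
                    ("Marasmius isolation", marasmius_ss),
                    ("Corticium isolation", corticium_ss)]:
    print(name, round(value, 1))
```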


Interaction: Day × Genus. In this calculation, the aggregate treatment effect for these two factors has got to be assessed from the totals of each genus on each day. The interaction is this aggregate sum of squares less the components already calculated for the day and the genus factors independently.

$$\text{Aggregate day} \times \text{genus S.S.} = \frac{269^2 + 253^2 + 243^2}{18} + \frac{305^2 + 313^2 + 309^2}{12} - \frac{1{,}692^2}{90} = 2{,}920$$

$$\text{Interaction} = 2{,}920 - (\text{S.S. genus} + \text{S.S. day}) = 2{,}920 - (2{,}898.1 + 8.3) = 13.6$$

Interaction: Day × Marasmius Isolation.

$$\text{Aggregate S.S. (day} \times \text{Marasmius isolation totals)} = \frac{\text{sum of squares of the nine day} \times \text{isolation totals}}{6} - \frac{765^2}{54} = 782.3$$

$$\text{Day, for Marasmius alone} = \frac{269^2 + 253^2 + 243^2}{18} - \frac{765^2}{54} = 19.1$$

$$\text{Marasmius isolation} = 751.4 \text{ (as already calculated)}$$

$$\text{Interaction} = 782.3 - (751.4 + 19.1) = 11.8$$

Interaction: Day × Corticium Isolation.

$$\text{Aggregate S.S.} = \frac{165^2 + 146^2 + 162^2 + 140^2 + 167^2 + 147^2}{6} - \frac{927^2}{36} = 110.2$$

$$\text{Day, for Corticium alone} = \frac{305^2 + 313^2 + 309^2}{12} - \frac{927^2}{36} = 2.7$$

$$\text{Corticium isolation} = 10.1 \text{ (as already calculated)}$$

$$\text{Interaction} = 110.2 - (10.1 + 2.7) = 97.4$$
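
An interaction sum of squares is the aggregate sum of squares for the two factors less the components already found for each factor separately. The Python sketch below illustrates this for the day × genus and day × Corticium isolation interactions; the helper name `aggregate_ss` is hypothetical, and the totals are those quoted in the text. Small differences from the printed figures arise from rounding of the intermediate values.

```python
def aggregate_ss(totals, units_per_total, grand_total, grand_count):
    """Sum of squared totals over the units in each, less the correction factor."""
    return sum(t ** 2 for t in totals) / units_per_total - grand_total ** 2 / grand_count

# Day x genus: the daily totals of the two genera rest on 18 and 12 plates.
agg_day_genus = (
    sum(t ** 2 for t in (269, 253, 243)) / 18
    + sum(t ** 2 for t in (305, 313, 309)) / 12
    - 1692 ** 2 / 90
)
interaction_day_genus = agg_day_genus - (2898.1 + 8.3)

# Day x Corticium isolation: six totals of six plates each.
agg_corticium = aggregate_ss((165, 146, 162, 140, 167, 147), 6, 927, 36)
day_corticium = aggregate_ss((305, 313, 309), 12, 927, 36)
interaction_corticium = agg_corticium - (10.1 + day_corticium)

print(round(agg_day_genus, 1), round(interaction_day_genus, 1))
print(round(agg_corticium, 1), round(day_corticium, 1), round(interaction_corticium, 1))
```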

Table 20. Analysis of Variance

Factor                            S.S.       Degrees of   Variance
                                             freedom
Treatment:
  Genus                           2,898.1     1           2,898.10
  Day                                 8.3     2               4.15
  Marasmius isolation               751.4     2             375.70
  Corticium isolation                10.1     1              10.10
Interaction:
  Day × genus                        13.6     2               6.80
  Day × Marasmius isolation          11.8     4               2.95
  Day × Corticium isolation          97.4     2              48.70
Error                               573.7    75               7.65
Total                             4,364.4    89

Variances which are greater than the error variance are tested by the F test.

Summary of Results. The rate of growth of Corticium is distinctly greater than that of Marasmius, the corresponding mean values being 12.9 and 7.1 millimeters, respectively.

The low variance for the time factor indicates that on the average the growth rate remained constant over the 3-day period.

There is a marked difference in the growth rate of the three Marasmius isolations, the mean values being 6.47, 5.17, and 9.61 millimeters.

The standard error of the difference between these means

$$= \frac{1}{2}\sqrt{\frac{7.65 \times 2}{18}} = 0.46 \text{ mm.}$$

As the error variance is based on 75 degrees of freedom, a difference greater than two times the standard error of the difference, i.e., 2 × 0.46 = 0.92 millimeter, is significant. Thus all three isolations must be regarded as different fungi.

As the interaction Corticium isolation × day is significant, it is necessary to compare the daily totals of the C2 and C5 isolations. These totals are:

Isolation (half-mm.)    1st day    2d day    3d day
C2                      165        146       162
C5                      140        167       147

A difference between these values greater than $2 \times \sqrt{7.65 \times 6 \times 2}$ or 19.1 half-millimeters is significant.

For the 3-day period, there is no difference in the rate of growth of isolation C2, but C5 shows a definite increase in increment on the second day. Furthermore, C5 has grown more slowly than C2 on the first day but more rapidly on the second day. The significant differences are only just significant at the 5 per cent point, and further experimentation would be required before it could be safely stated that these differences are typical of the two isolations.

Further elaboration here of the analysis of variance technique is unnecessary, as many additional examples will be found in Chaps. VII and VIII on field experiments. These effectively illustrate the method as applied to rather more complex data.
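
The standard errors used in the summary of the thread-blight results can be reproduced from the error sum of squares. The following Python sketch is illustrative only; the variable names are hypothetical, and the conversion from half-millimeters to millimeters by halving is made explicit as an assumption about the units.

```python
import math

# Error S.S. over its degrees of freedom, in (half-millimeters) squared.
error_variance = 573.7 / 75

# Means of the three Marasmius isolations: 18 observations each, halved to
# convert from half-millimeters to millimeters.
marasmius_means_mm = [total / 18 / 2 for total in (233, 186, 346)]

# Standard error of the difference between two such means, in millimeters.
se_difference_mm = math.sqrt(error_variance * 2 / 18) / 2

# Least significant difference between two daily totals of six plates,
# in half-millimeters, taking t as 2.
lsd_daily_totals = 2 * math.sqrt(error_variance * 6 * 2)

print([round(m, 2) for m in marasmius_means_mm])
print(round(se_difference_mm, 2), round(lsd_daily_totals, 1))
```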

EXPERIMENTAL PRECISION

The quantity of information that may be derived from any experiment is inversely proportional to its error variance, i.e., it is proportional to its invariance, a term used for $\frac{1}{\text{error variance}}$. Thus, if two comparable experiments yield error variances of 10 and 30, respectively, the information accruing from the former is theoretically three times as much as from the latter, the equivalent invariances being $\frac{1}{10}$ and $\frac{1}{30}$. An easy method of determining the relative precision of two experiments A and B is to calculate the ratio

$$\frac{\text{error variance of } B}{\text{error variance of } A}$$

This ratio measures the degree of precision of A relative to B. In the numerical example quoted above the relative precision would be $\frac{30}{10}$ or 3:1, showing that the first experiment is three times as precise as the second.

Tests of significance depend on the estimation of the appropriate standard errors, obtained by calculating $\frac{\sigma}{\sqrt{n}}$. For any particular experiment, the magnitude of the standard error of any treatment mean varies inversely with the square root of the number of variates from which that mean has been evaluated. To halve the significant difference or double the experimental precision would theoretically entail the multiplication of the number of replicates in each treatment by four. More generally, if it is desired to reduce the significant difference in a given experiment to $\frac{1}{x}$ of its former value, the number of replicates would have to be increased $x^2$ times. On this basis, to effect a reduction in the significant difference from 15 to 10 per cent of the mean would necessitate the multiplication of the number of replicates by $\frac{9}{4}$ or 2.25. In practice this test tends to exaggerate the number of replicates required for any stipulated increase in precision, as any addition to the number of replicates increases the number of degrees of freedom of the error variance, which in turn will tend to reduce the estimate of the significant difference. This test, therefore, errs on the safe side.
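
Both rules of this section reduce to simple ratios. A minimal Python sketch is given below by way of illustration; the function names are hypothetical, and the second function ignores the gain in degrees of freedom, so that it errs on the safe side in the manner just described.

```python
def relative_precision(error_variance_a, error_variance_b):
    """Precision of experiment A relative to B: error variance of B over that of A."""
    return error_variance_b / error_variance_a

def replicate_multiplier(current_difference, target_difference):
    """Factor by which the replicates must be multiplied to reduce the
    significant difference from its current value to the target value."""
    return (current_difference / target_difference) ** 2

print(relative_precision(10, 30))    # error variances of 10 and 30
print(replicate_multiplier(15, 10))  # from 15 to 10 per cent of the mean
```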

If, from previous experiments, the approximate value of the error variance likely to be provided by future observations of the same kind is known, it is possible to arrive at a satisfactory estimate of the number of replicates required for any specified level of precision. The test for a significant difference between two treatment means is based on the value of t, where

$$t = \frac{D\text{, the difference between the means}}{\text{standard error of this difference}}$$

If n* is the number of replicates of each treatment and $\sigma^2$ the anticipated error variance as shown by the analyses of variance of previous experiments, then

$$t = \frac{D}{(\sigma/\sqrt{n})\sqrt{2}} = \frac{D\sqrt{n}}{\sigma\sqrt{2}}$$

and

$$n = \left(\frac{t \times \sqrt{2} \times \sigma}{D}\right)^2$$

* This n must not be confused with the symbol n in the Table of t (Table II, Appendix), where it represents the number of degrees of freedom of the error variance.

By substitution of the appropriate values in this equation, it is possible to calculate the number of replicates of each treatment necessary to prove significant any difference greater than D. In using the formula, it is advisable to express D and $\sigma$ as percentages of the general mean; $\sigma$ then becomes the average coefficient of variation of previous experiments. The value of t is the reading from the table for any desired probability (usually 0.05) and the number of degrees of freedom from which $\sigma$ was originally estimated. For example, if the expected coefficient of variation is about 6 per cent, as evaluated from 24 degrees of freedom, the number of replicates required to show significance in treatment differences exceeding 5 per cent of the general mean would be

$$n = \left(\frac{2.064 \times 1.414 \times 6}{5}\right)^2 = 12.3$$

Thirteen replicates might therefore be taken as a reasonable estimate of the number required for the specified level of precision for future experiments. It is only an estimate, as the accuracy of the test depends on the accuracy with which the coefficient of variation can be predetermined from previous research. Also, any marked change in the number of replicates will mean that the value of t used will not be correct for the proposed new experiment. If it can be safely assumed that the number of degrees of freedom of the error variance will exceed 30, the equation can be considerably simplified by using a value of t equal to 2.0. The equation then resolves into

$$n = \frac{8\sigma^2}{D^2}$$

On this basis, for the numerical example cited, the number of replicates required would be $\frac{8 \times 6^2}{5^2} = 11.52$. The result agrees sufficiently closely with that already obtained from substitution in the more elaborate formula.
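
The two forms of the replicate formula may be compared numerically. The Python sketch below is illustrative only; the function name `replicates_required` is hypothetical, and the value of t (2.064 for 24 degrees of freedom) is the one quoted in the text.

```python
import math

def replicates_required(t_value, cv_percent, difference_percent):
    """n = (t * sqrt(2) * sigma / D)^2, with sigma and D both expressed as
    percentages of the general mean."""
    return (t_value * math.sqrt(2) * cv_percent / difference_percent) ** 2

# Full formula: t for 24 degrees of freedom, 6 per cent coefficient of
# variation, 5 per cent difference to be shown significant.
n_full = replicates_required(2.064, 6, 5)

# Simplified formula with t taken as 2.0, i.e., n = 8 * sigma^2 / D^2.
n_simple = 8 * 6 ** 2 / 5 ** 2

print(round(n_full, 2), math.ceil(n_full), round(n_simple, 2))
```

Rounding the result upward to the next whole number gives the number of replicates to be laid down.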

Actually, it is possible by the application of these principles to compile tables from which, provided an estimate of the amount of dispersion likely to be shown by any particular variables is available, the number of replicates of each treatment required for any specified level of precision can be read off. Such tables can be very helpful in drawing up experimental plans, and one of the type suggested by Bird and Gutteridge has been given in the Appendix (Table VII). For any estimated coefficient of variability, this table records the minimum number of replicates of each treatment series which would be necessary to prove that any stipulated percentage difference between the treatment means is significant. The table is compiled for a fixed value of P as the level of significance. Different values of the coefficient of variability are tabulated along the top of the table, and the treatment differences, expressed as percentages of the mean treatment value, are entered down the left-hand side. For the numerical example already cited, reference to the table shows that, for a 6 per cent coefficient of variability, 13 replicates of each series will be necessary if a 5 per cent difference between treatment means is to be significant. This was the number of replicates already obtained by calculation from the original formula. The table may also be used in the reverse direction. In an experiment in which a 9 per cent coefficient of variability is expected and eight replicates of each treatment have been included, only differences between treatment means of 10 per cent or more will be significant.

It is necessary to emphasize that the table has been compiled from values of t applicable to data limited to only two treatment series. If it is consulted in connection with experiments involving more than two treatment series, it is therefore subject to the limitations of accuracy already mentioned in connection with the original formula. In these circumstances, the recorded values will be only approximately correct, but the table will still serve as a rough guide in the designing of experiments intended to attain any particular level of precision.
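
The reverse use of the table corresponds to solving the replicate formula for D. The Python sketch below is offered as an illustration and is not necessarily the way in which Table VII was compiled. It assumes that two treatment series of eight replicates leave 14 degrees of freedom for error, for which the 5 per cent value of t is taken as about 2.145; both the function name and this value of t are assumptions introduced for the example.

```python
import math

def least_significant_difference_percent(t_value, cv_percent, replicates):
    """D = t * sqrt(2) * sigma / sqrt(n), rearranged from
    n = (t * sqrt(2) * sigma / D)^2, with sigma and D as percentages."""
    return t_value * math.sqrt(2) * cv_percent / math.sqrt(replicates)

# Assumed: t = 2.145 for 14 degrees of freedom (two series of eight replicates).
d = least_significant_difference_percent(2.145, 9, 8)
print(round(d, 2))
```

On these assumptions the least significant difference is a little under 10 per cent of the mean, in keeping with the reading from the table.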

Discussion of experimental precision would not be complete without some mention of the advantages of comprehensive or relatively complex experiments, properly designed so as to permit of a valid analysis of variance of the data. The most obvious advantage is that, in large-scale research, the number of degrees of freedom associated with the error variance is high and, in consequence, the estimate of the standard deviation obtained from the data has a much better chance of approximating to the true value for the whole population. Secondly, a complex experiment including several treatment series in all combinations greatly widens the field of information that would be covered by a number of simple experiments in which each treatment series was tested independently. Experimental results are considerably influenced by environmental factors. For example, storage problems are affected by changes in temperature, livestock development by maintenance conditions, social problems by race and climate, and so on. In simple experiments including only a single series of treatments, all the other influential agencies have got to be standardized as far as possible. The standards used are of necessity predetermined on a somewhat arbitrary basis. By superimposing several distinct series of treatments in a single balanced experiment, it is possible to ascertain not only the best treatment in each series but the particular combination of factors which leads to the optimum result. The analysis of variance technique makes it possible to work up the resultant data on an accurate statistical basis. In most research, there is an almost endless series of combinations of treatments that might be included in each experiment. It is obvious that the observations have to be limited to a number which can be effectively controlled. The amount of complexity advisable will depend largely on the experience of the staff in charge and on the facilities available for taking the records and carrying out the statistical interpretation of results. In conclusion, therefore, it is advisable to stress the danger of overambitious experimentation. Complex experiments do definitely widen the field of information, but only when they are effectively designed and executed.

USEFUL FORMULAS IN ANALYSIS OF VARIANCE

Let y = any variate.
p = the number of treatments or series.
n = the number of variates in any one series, i.e., the number of replicates.
M = the general mean for all the np variates.
$M_t$ = any treatment mean.

By Direct Calculation.

$$\text{Total S.S.} = \Sigma(y - M)^2 \text{ with } np - 1 \text{ degrees of freedom}$$

$$\text{Treatment S.S.} = \Sigma(M_t - M)^2 \times n \text{ with } p - 1 \text{ degrees of freedom}$$

$$\text{Error S.S.} = \text{total S.S.} - \text{treatment S.S. with } p(n - 1) \text{ degrees of freedom}$$

By Variable-squared Method.

Let $T_t$ = any treatment total.

$$\text{C.F.} = \frac{(\Sigma y)^2}{np}$$

$$\text{Total S.S.} = \Sigma y^2 - \text{C.F. with } np - 1 \text{ degrees of freedom}$$

$$\text{Treatment S.S.} = \frac{\Sigma T_t^2}{n} - \text{C.F. with } p - 1 \text{ degrees of freedom}$$

$$\text{Error S.S.} = \text{total S.S.} - \text{treatment S.S. with } p(n - 1) \text{ degrees of freedom}$$

If the assumed-mean method is used, the same formulas apply provided the symbols are taken to represent corresponding values on the table of differences from the assumed mean.

If the treatments are more complex in character and include two distinct series of factors A and B, there will be $p_A \times p_B$ combinations or treatment types.

Let $T_t$ = the total of the variates belonging to any one treatment type.
$T_A$ = the total of the variates belonging to any treatment in Series A, irrespective of the B factor.
$T_B$ = the total of the variates belonging to any treatment in Series B, irrespective of the A factor.

$$\text{C.F.} = \frac{(\Sigma y)^2}{n \times p_A \times p_B}$$

$$\text{Aggregate treatment S.S.} = \frac{\Sigma T_t^2}{n} - \text{C.F. with } p_A p_B - 1 \text{ degrees of freedom}$$

Main effects:

$$\text{Treatment A S.S.} = \frac{\Sigma T_A^2}{n \times p_B} - \text{C.F. with } p_A - 1 \text{ degrees of freedom}$$

$$\text{Treatment B S.S.} = \frac{\Sigma T_B^2}{n \times p_A} - \text{C.F. with } p_B - 1 \text{ degrees of freedom}$$

$$\text{Interaction S.S.} = \text{aggregate treatment S.S.} - (\text{treatment A S.S.} + \text{treatment B S.S.}) \text{ with } (p_A - 1)(p_B - 1) \text{ degrees of freedom}$$

In experiments in which each whole unit is divided into subunits, there are two groups of treatments: the first, the Series A group, comprises the whole-unit treatments and the second, the Series B group, the subunit treatments.

Let Y = any integral variate in Series A, i.e., any whole unit.
y = any subunit.
n = the number of whole units in each of these treatments.
$p_A$ = the number of treatments in Series A.
$p_B$ = the number of treatments in Series B, i.e., the number of subunits to which each whole unit or Y is divided.

$$\text{C.F.} = \frac{(\Sigma y)^2}{n \times p_A \times p_B}$$

$$\text{I. Total S.S.} = \Sigma y^2 - \text{C.F. with } n p_A p_B - 1 \text{ degrees of freedom}$$

$$\text{II. Total whole-unit S.S.} = \frac{\Sigma Y^2}{p_B} - \text{C.F. with } n p_A - 1 \text{ degrees of freedom}$$

$$\text{IIa. Series A treatment S.S.} = \frac{\Sigma T_A^2}{n \times p_B} - \text{C.F. with } p_A - 1 \text{ degrees of freedom}$$

$$\text{Error (a) S.S.} = \text{II} - \text{IIa with } p_A(n - 1) \text{ degrees of freedom}$$

$$\text{III. Series B treatment S.S.} = \frac{\Sigma T_B^2}{n \times p_A} - \text{C.F. with } p_B - 1 \text{ degrees of freedom}$$

$$\text{IV. Interaction series A} \times \text{B} = \frac{\Sigma T_t^2}{n} - \text{C.F.} - (\text{IIa} + \text{III}) \text{ with } (p_A - 1)(p_B - 1) \text{ degrees of freedom}$$

$$\text{Error (b) S.S.} = \text{I} - (\text{II} + \text{III} + \text{IV}) \text{ with } p_A(n - 1)(p_B - 1) \text{ degrees of freedom}$$

Significance. In an analysis of variance, any component variance is significantly different from the error variance when F, the ratio of the larger variance to the smaller variance, is greater than the reading from the Table of F (Appendix, Table VI) for P = 0.05. The reading required is the one for values of $n_1$ and $n_2$ on the table equivalent to the number of degrees of freedom of the larger and the smaller variances, respectively. Alternatively, the component variances are significantly different when the difference between $\tfrac{1}{2}\log_e$ of the individual variances is greater than the reading from the 5 per cent Table of z (Appendix, Table III) for the appropriate values of $n_1$ and $n_2$. The t test, and the estimated standard errors, as a method of determining significant differences between treatment means or treatment totals, may be validly used only when the F or the z test, applied to the relative treatment variances, has given a significant result.

The Chi-squared ($\chi^2$) Test

Discussion in the preceding chapters has been limited to problems in which information regarding one or more populations is obtained by means of selecting representative samples from which appropriate measurements are taken. These observations are then used to provide estimates of certain statistics, which make it possible to distinguish between real and fortuitous differences in the data on some predetermined level of probability. There is another type of problem which frequently crops up in scientific research and that is the one in which the observer commences with a certain hypothesis based on some general law of nature or evolved by inductive reasoning. The experimental data in this case are collected in order to test whether the particular material under observation comes within the jurisdiction of the general law, or whether the preconceived hypothesis is in agreement with the actual facts as recorded in the experiment. It is not practicable to take an infinite number of variates and once again the observed data represent merely a sample of the whole and, in consequence, these observed values will not normally tally exactly with the theoretical or expected ones that may be deduced from the original hypothesis. The question that at once arises is, "What are the limits which the deviation between observed and expected values must not exceed if it is to be regarded as caused by errors of random sampling and not by some fundamental discrepancy between the hypothesis and the facts?" The $\chi^2$ test is used to determine the goodness of fit between the observed and the expected values.

$$\chi^2 = S\left(\frac{x^2}{m}\right)$$

where x = the difference between the observed and expected values in any one class.
m = the expected value in any one class.
S = the sum of $\frac{x^2}{m}$ for all available classes.




Any estimate of $\chi^2$ is therefore based on the magnitude of the difference between the observed and expected values in each class and on the number of classes or independent comparisons available. This latter factor measures the number of degrees of freedom which can be correctly attributed to the estimate of $\chi^2$. The theoretical distribution of $\chi^2$ has been worked out, and this can be used, on much the same principle as that described in connection with the normal distribution, to determine the probability of exceeding any calculated level of $\chi^2$ purely as a result of the ordinary errors of random sampling. Knowing the theoretical distribution, statisticians have been able to compile tables from which this probability can be readily determined. For any particular number of degrees of freedom n, the larger the estimate of $\chi^2$, the greater is the discrepancy between the observed and the expected values and the smaller are the chances that the hypothesis from which the expected values have been determined is correct. It is customary to accept a probability less than 0.05 as sufficient proof of a significant discrepancy between the hypothesis and the observed facts, and it may be assumed that, for probabilities in excess of this, there is no reason to suspect the truth of the hypothesis. This is, of course, a purely arbitrary standard which will generally, but not infallibly, provide an accurate interpretation of the results.

Fisher's Table of $\chi^2$ (Appendix, Table IV) gives the value of $\chi^2$ for selected probabilities P ranging from 0.99 to 0.01 and for degrees of freedom n from 1 to 30. In using the table, it is of primary importance to compare the calculated value with the table reading corresponding to the correct value of the number of degrees of freedom represented by the data. The required reading is that corresponding to n on the table equal to the number of independent ways in which the observed values may be compared with the expected. The test is valid only when the individuals sampled are independent and when there is a reasonable number of individuals, say not less than ..., in each expected class. Provided these relatively simple restrictions are carefully observed, there is probably less risk of a nonvalid use of the table than in the case of certain other statistical tests subject to the assumption of a normal distribution of the variable concerned.

Consider the following very simple example. If a penny is tossed up a very large number of times, one would anticipate, assuming that the penny is properly balanced, that the number of heads and the number of tails recorded would be approximately the same. A penny was tossed 960 times and 516 heads appeared. Is this in agreement with the hypothesis that the penny is not biased?

$$\chi^2 = \frac{(516 - 480)^2}{480} + \frac{(444 - 480)^2}{480} = \frac{2(\pm 36)^2}{480} = 5.40$$

The number of degrees of freedom is, of course, only 1, as when the number of heads has been counted, the number of tails is fixed by subtraction from the total throws. Reference to the Table of $\chi^2$ for n = 1 shows that $\chi^2$ equal to 5.40 corresponds to a probability lying between 0.05 and 0.02. (Actual readings are 3.841 for P = 0.05 and 5.412 for P = 0.02.) Applying the accepted standard for a significant discrepancy (P < 0.05), it must therefore be assumed that the hypothesis that the penny is evenly balanced is wrong.
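
The calculation for the penny may be written out as a short routine. The Python sketch below is an illustration only; the function name `chi_squared` is hypothetical, and the two table readings are those quoted in the text for one degree of freedom.

```python
def chi_squared(observed, expected):
    """Sum of (observed - expected)^2 / expected over all classes."""
    return sum((o - m) ** 2 / m for o, m in zip(observed, expected))

tosses = 960
heads = 516
observed = [heads, tosses - heads]    # heads and tails recorded
expected = [tosses / 2, tosses / 2]   # a balanced penny

chi2 = chi_squared(observed, expected)

# Readings of the Table of chi-squared for n = 1, as quoted in the text.
reading_p_05 = 3.841
reading_p_02 = 5.412
significant = chi2 > reading_p_05

print(round(chi2, 2), significant)
```

The calculated value exceeds the reading for P = 0.05 but falls just short of that for P = 0.02, so that the probability lies between these two levels.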

BlNOMIAh DISTRIBUTION 

The first theoretical dislribution to be established by sta- 
tisticians was the binomial distribution. As the name indicates, 
this distribution is ba.srd on the binomial theorem, and before 
demonstrating its use in statistics, it is possibly advisable to 
revise very briefly the binomial expansion. The number of 
combinations of n articles taken ft at a time is given by the 
symbol ,0*. where 

* !) (?* - 2 ) (« - 3) . . . (ra - ft + 1) 


‘The binomial formula gives the expansion of expressions of 
the type (a: +^|^)", where n is an integer. 

“ , 'I* 

(« + y)" =» »" 4- «Ci 4- nCs r’^'V 4- dCi a:"-y +■■■ 

, Tbe factors from each term of the expanaoh are known 
m coe£kients. ^Ci and both reduce to 

n. so (Efficients in the above expansion axe 1, n, „<?s, 
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Example 10.— - As an cxamislf* of flioapplic.-if ion of the binomial 
tlicorem in st.atistieal work, l(;t ns consider the simple case in 
whi(;h six pennies are kissed repijatedly and the number of heads 
appearing on each <.iceasioii is iioled. After a reasonable number 
of trials, it, is possible to diwv up a frequency titblc showing the 
number of iimc.s or frctpicney 0, 1, 2, up to (i heads were obtained. 
In 960 trials the following rusuli was recorded: 


Total 900 

Mathiwnatudans have .shown that if the probability that an 
event will hiiiipen is p and that it will not happen is $ (when 
g p = 1) and if a random sample n in number is taken suffi- 
ciently often, the frequency distribution .showing the number 
of rmcasions in which the event should appear 0, 1, 2, ... » 
times in any one trial is given by the expansion of the binomial 
(? + p)"> In iho example cited, presuming that the coins are 
properly balanced, there is an equal chance of a head or a tail 
appearing at each toas, so that p and q are both equal to 
As six coins arc tossed in each trial, n, the sample or trial number, 
is six. On tlie hypothesis that there is no bias in the coins, the 
frequency di.st ribution showing the frequency with which 0, 
I, up to 6 head.s should appear will be represented by expansion 
of the? binomial 


iH + + 6 :X mrH + X (Hmiy + • • • 

6 X mny + (M)“ 

,» HHl + 6 - 1-16 + 20 - 1-16 + 6 + 1 ) 

As there are 960 trials in all, the frequency with which 0, 1,
2, ..., 6 heads may be expected can be calculated by dividing
960 in the proportion of the binomial coefficients given in the
parentheses. In Table 21 these expected frequencies are
tabulated alongside the observed, and $\chi^2$ is evaluated in
order to test whether there is any significant difference between
the observed and the expected values.

Table 21. The Evaluation of $\chi^2$

No. of     Observed     Expected
heads      frequency    frequency (m)    x^2/m
  0            6            15            5.41
  1           74            90            2.84
  2          210           225            ...
  3          299           300            ...
  4          252           225            3.24
  5          108            90            3.60
  6           11            15            ...
Total        960           960           16.65

In the calculation of $\chi^2$, seven comparisons between the
observed and the expected frequencies are available. By the
time the first six frequencies in each column have been entered, the
last one is predetermined, as the total frequency in each case must
add up to 960. This means that the final value of $x$ is also
predetermined by the preceding six entries. There are thus only six
independent comparisons, and the number of degrees of freedom
of $\chi^2$, as calculated, is only six. Reference to the Table of $\chi^2$
opposite $n = 6$ shows that for $\chi^2 = 16.65$, $P$ lies between 0.02
and 0.01. There is therefore a significant discrepancy between
the observed and expected values. The most likely explanation
of this is that some of the pennies are slightly biased.

In tossing an ordinary die, there is a 1 in 6 chance of a six appearing
in any one throw. In tossing five dice a number of times,
the frequency distribution showing the number of occasions
in which six appears 0, 1, 2, ... 5 times in any one throw
should conform to the expansion of the binomial

$$\left(\tfrac{5}{6} + \tfrac{1}{6}\right)^5 = \tfrac{1}{7{,}776}(3{,}125 + 3{,}125 + 1{,}250 + 250 + 25 + 1)$$

The sum of the terms within parentheses is 7,776, and the
probability of obtaining a maximum of five sixes in any one throw
is therefore only $\tfrac{1}{7{,}776}$. In contrast to this, the probability
of obtaining not more than one six in any throw is

$$\frac{3{,}125 + 3{,}125}{7{,}776} = 0.80$$

equivalent on the average to four out of every five trials.
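
The coin and dice expansions discussed above can be checked numerically. The short Python sketch below is an illustration only; the helper name `binomial_terms` is an assumption introduced for this example and does not come from the original text.

```python
from fractions import Fraction
from math import comb

def binomial_terms(p, n):
    """Terms of (q + p)^n: probability of the event appearing 0..n times."""
    q = 1 - p
    return [comb(n, k) * p**k * q**(n - k) for k in range(n + 1)]

# Six balanced coins, 960 trials: expected frequency of 0..6 heads.
coin_expected = [960 * term for term in binomial_terms(Fraction(1, 2), 6)]

# Five dice: probability of a six appearing 0..5 times in one throw.
dice = binomial_terms(Fraction(1, 6), 5)
p_five_sixes = dice[5]
p_at_most_one_six = dice[0] + dice[1]

print([int(f) for f in coin_expected])
print(p_five_sixes, float(p_at_most_one_six))
```

Exact fractions are used so that the terms reproduce the whole-number coefficients of the expansions without rounding.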

The binomial distribution can sometimes be used with advantage
to provide statistical evidence of the significance or otherwise
of results of an observational nature. If the chances that
an event will or will not occur are equal ($p = q = \tfrac{1}{2}$), then in a
large number of trials, $n$ at a time, the number of successes will
be distributed in accordance with the binomial expansion
$\left(\tfrac{1}{2} + \tfrac{1}{2}\right)^n$, or the binomial coefficients divided by $2^n$.

In a single trial the probability of obtaining $n$ successes and
no failures is $\dfrac{1}{2^n}$. Similarly, the probability of obtaining not
more than one failure is $\dfrac{1 + n}{2^n}$, and in general, the probability
of obtaining not more than $x$ failures is the sum of the first
$x + 1$ coefficients divided by $2^n$.

In a series of storage tests with grapefruit in which half the
fruit was wrapped in cellophane, it was noticed that in 16 out of
20 trials the amount of pitting was obviously greater in the case
of the unwrapped fruit. Can it be stated that the
wrapping of the fruit has been helpful in reducing the intensity
of the pitting? If the cellophane has had no effect, in any one
test the wrapped and the unwrapped fruit have equal chances
of showing a greater degree of pitting purely as a result of the
unavoidable errors of random sampling. From the binomial
expansion $\left(\tfrac{1}{2} + \tfrac{1}{2}\right)^{20}$ it would appear that the probability of
obtaining, purely by chance, a proportion of 16 to 4 in favor of
the cellophane is only

$$\frac{1 + 20 + 190 + 1{,}140 + 4{,}845}{2^{20}}$$

or approximately 0.006. It can therefore be stated that the cellophane
has reduced the amount of damage by pitting.



The binomial expansion provides a relatively simple test of
certain types of research data. It is particularly useful in
problems in which no numerical values are available. The
arithmetical work involved is slight, and for this reason it is sometimes
used to carry out a rapid statistical examination of bulky records.
It is by no means so critical a test as the $t$ test, and when the data
permit, the $t$ test is the better one to apply.

Example 11. In the following experiment involving 10 plots
of maize, half of each plot was sown with seed which had been
treated for smut with formalin, and the other half was sown with
untreated seed. The results are recorded in Table 22.

Table 22. Yields of Maize from Treated and Untreated Seed

        Average weight of grain
        per half plot, kg.            Difference in weight     Square of
Plot    Treated (T)   Untreated (U)   in same plot (T - U)     differences
  1        150            144                 +6                   36
  2        177            175                 +2                    4
  3        163            150                +13                  169
  4        185            175                +10                  100
  5        139            136                 +3                    9
  6        149            133                +16                  256
  7        201            206                 -5                   25
  8        170            158                +12                  144
  9        135            128                 +7                   49
 10        100            101                 -1                    1
Total                                   +69 - 6 = +63             793

If the yields for the corresponding half plots are compared,
it will be noted that in only 2 out of the 10 plots was the weight
of the grain smaller in the case of the treated seed. The
probability of this occurring purely by chance is
$\dfrac{1 + 10 + 45}{2^{10}} = 0.055$. By the binomial method of analysis, it would appear
that the difference in favor of the treated seed is barely significant.

As the actual weights of seed per half plot have been recorded,
it is possible to apply the $t$ test to the data:

$$t = \frac{\text{Mean difference}}{\text{Standard error of the mean difference}}$$

The calculated value of $t$ works out at 3.00, and reference to the
Table of $t$ shows that, for 9 degrees of freedom, the probability of
exceeding this value purely by chance is less than 0.02, proving that
the mean difference in favor of the treated seed is definitely
significant. In this example a more critical analysis of the data has
resulted from the use of the $t$ test.
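
Both analyses of Example 11 can be followed step by step in a few lines. The Python sketch below is an illustration under stated assumptions: the treated weights for plots 3 and 4 are inferred here from the recorded differences and squares, and the variable names are chosen for this example only.

```python
from math import comb, sqrt

# Weights in kg. per half plot (plots 3 and 4 treated values are inferred).
treated   = [150, 177, 163, 185, 139, 149, 201, 170, 135, 100]
untreated = [144, 175, 150, 175, 136, 133, 206, 158, 128, 101]

diffs = [t - u for t, u in zip(treated, untreated)]
n = len(diffs)

# Binomial method: plots in which the treated half yielded less.
failures = sum(d < 0 for d in diffs)
p_binomial = sum(comb(n, k) for k in range(failures + 1)) / 2**n

# t test on the paired differences, with n - 1 degrees of freedom.
mean_diff = sum(diffs) / n
variance = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
std_error = sqrt(variance / n)
t_value = mean_diff / std_error

print(failures, p_binomial, t_value, n - 1)
```

The binomial method uses only the signs of the differences, whereas the $t$ test also uses their sizes, which is why it gives the more critical result here.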


CONTINGENCY TABLES


Observations relative to a given population can often be
grouped in several alternative ways. It then becomes possible
to draw up a contingency table showing the proportionate
number found in each of the selected classes and subclasses.

Example 12. In an orchard of 1,000 trees a record was taken
of the number of shaded to unshaded trees and in each of these
classes the proportion of high to low yielding trees. The results
are appended in a 2 X 2 contingency table.

Table 23. Census of an Orchard

                  Shaded     Unshaded    Total
High yielders     350 (a)    205 (b)       555
Low yielders      250 (c)    195 (d)       445
Total             600        400         1,000


A cursory examination of these figures shows that shade
appears to favor an increase in the proportion of high yielding
trees. It is possible to use the Table of $\chi^2$ to ascertain whether
this apparent difference is purely fortuitous or whether the
proportions within each class are actually influenced by the other
factors concerned. This has been termed the test of independence.
In applying this test, it is necessary to calculate the number of
trees in each subclass that might be expected on the assumption
that the two main factors of shade intensity and yield capacity
are entirely independent. This is achieved by dividing the total
number of shaded trees, and then of unshaded trees, in the
proportion of high to low yielding trees in the orchard. The
expected values are therefore:

Shaded high yielders      600 X 555/1,000 = 333
Shaded low yielders       600 X 445/1,000 = 267
Unshaded high yielders    400 X 555/1,000 = 222
Unshaded low yielders     400 X 445/1,000 = 178

Another way of arriving at exactly the same result would be
to divide the total of the high yielders and then of the low
yielders in the proportion that the shaded and unshaded trees are
of the total. It is fairly obvious that, if a single expected value
is calculated, the remaining three can be filled in by subtraction
from the totals actually recorded in the respective rows and
columns. Thus the expected value for shaded low yielders is
600 - 333 = 267. As a single expected value determines the
remainder, the number of degrees of freedom of $\chi^2$ will be unity.

$$\chi^2 = \sum \frac{x^2}{m} = 4.876$$

where $m$ represents any expected value and $x$ the
difference between this and the corresponding observed value.

Reference to the Table of $\chi^2$ shows that for $n = 1$, the probability
of this value of $\chi^2$ being obtained purely by chance lies between
0.05 and 0.02. This proves that the shade has had a definite
influence on the proportion of heavy to light yielders. In this
particular orchard, shade is apparently beneficial to the trees.
It is necessary to emphasize here that $\chi^2$ is not in any way a
measure of the amount of the influence of one class on another; it
merely shows whether the two classes are independent or not.
Thus, if a similar experiment in another orchard had been carried
out and the calculated value of $\chi^2$ corresponded to a probability
of 0.01, this would not prove that the effect of the shade in the
second orchard is more marked than in the first.

If the expected values are not otherwise required, it is possible
to calculate $\chi^2$ for a 2 X 2 contingency table directly from the
equation

$$\chi^2 = \frac{(ad - bc)^2 (a + b + c + d)}{(a + b)(c + d)(a + c)(b + d)}$$

where $a$, $b$, $c$, $d$ represent the values in the various subclasses as
annotated in Table 23. For the last example,

$$\chi^2 = \frac{(350 \times 195 - 205 \times 250)^2 (1{,}000)}{555 \times 445 \times 600 \times 400} = 4.876$$
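
The two routes to $\chi^2$ for a 2 X 2 table, through the expected values and through the direct equation, can be compared numerically. The Python sketch below is illustrative only; the counts are those of the orchard census (350, 205, 250, 195), and the variable names are assumptions made for this example.

```python
# Observed counts from the orchard census.
a, b = 350, 205   # high yielders: shaded, unshaded
c, d = 250, 195   # low yielders:  shaded, unshaded
total = a + b + c + d

# Method 1: expected values from the marginal totals, then sum of x^2 / m.
observed = [[a, b], [c, d]]
row_totals = [a + b, c + d]
col_totals = [a + c, b + d]
expected = [[r * col / total for col in col_totals] for r in row_totals]
chi2_expected = sum(
    (observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
    for i in range(2) for j in range(2)
)

# Method 2: the direct 2 x 2 equation.
chi2_direct = (a * d - b * c) ** 2 * total / (
    (a + b) * (c + d) * (a + c) * (b + d)
)

print(expected)
print(chi2_expected, chi2_direct)
```

Because the two expressions are algebraically identical, any difference between them in practice comes only from rounding the expected values.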

Example 13. In more complex contingency tables, the calculation
of $\chi^2$ is a little more involved, but the technique is merely an
extension of the principle described for the 2 X 2 table. As an
example of this, let it be assumed that in a second orchard the
classification of the trees had been extended to include a third
grouping according to three degrees of pruning, viz., heavy, light,
and unpruned, and the results were as follows:

Table 24

                    High yielders          Low yielders
Pruning system    Shaded   Unshaded     Shaded   Unshaded     Total
Heavy               ...       ...         ...       ...         ...
Light               ...       ...         ...       ...         ...
Unpruned            ...       ...         ...       ...         ...
Total               ...       ...         ...       ...         ...

The expected values are calculated as before by dividing each column
total between its three subclasses in proportion to the
totals of these subclasses, i.e., in proportion to the row totals.
Thus, in the first column each entry is the column total multiplied
by the corresponding row total and divided by the grand total,
and similarly with the succeeding columns. The entries in the
last row and last column can be filled in by subtraction from the
column and row totals, respectively, so that the number of
degrees of freedom is only 6. In general, if the contingency table
is composed of $r$ rows and $c$ columns, the number of degrees of
freedom from which $\chi^2$ is determined will be $(r - 1)(c - 1)$.
With complex contingency tables, it is advisable to draw up a
second table showing the expected values and the share of $\chi^2$
corresponding to each (Table 25).

$\chi^2 = 12.811$, which for 6 degrees of freedom corresponds to a
probability less than 0.05 and is therefore significant. Examination
of the distribution of the $\chi^2$ values shows that the high numbers
are located in the first column and particularly in the heavily
pruned, high yielding, shaded subclass. This combination of
factors has apparently increased the proportion of heavy yielding
trees and is presumably the best cultural practice to follow.
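
The general procedure for a table of $r$ rows and $c$ columns can be sketched as follows. This Python example is an illustration only: the counts are hypothetical and are not the figures of Table 24, and the function name `chi_square_table` is an assumption made for this example.

```python
def chi_square_table(observed):
    """Return (chi2, degrees of freedom, expected values) for an r x c table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand = sum(row_totals)
    # Each expected value: row total x column total / grand total.
    expected = [[r * c / grand for c in col_totals] for r in row_totals]
    chi2 = sum(
        (o - m) ** 2 / m
        for obs_row, exp_row in zip(observed, expected)
        for o, m in zip(obs_row, exp_row)
    )
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, dof, expected

# Hypothetical counts only (NOT the data of Table 24).
# Rows: heavy, light, unpruned.
# Columns: high shaded, high unshaded, low shaded, low unshaded.
hypothetical = [
    [60, 30, 25, 35],
    [45, 40, 35, 40],
    [35, 40, 40, 45],
]
chi2, dof, expected = chi_square_table(hypothetical)
print(chi2, dof)
```

Printing the individual terms $(o - m)^2/m$ alongside the expected values would give the second table recommended above, showing where the large contributions lie.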

PROBLEMS IN GENETICS

Example 14. The $\chi^2$ distribution is particularly useful in
genetical research as a means of testing whether the recorded
data are or are not in agreement with some hypothesis generally
based on the Mendelian theory. For example, in a cross between
ivory and red snapdragons, Bauer obtained the following in the
F2 generation:

Table 26

            Observed    Expected
Red           ...         ...
Pink          ...         ...
Ivory         ...         ...

It is desired to ascertain whether these figures show that
segregation is occurring in the simple Mendelian ratio of 1:2:1.
The expected values have been calculated on this basis.
There are 2 degrees of freedom, and the Table of $\chi^2$ shows $P$
to lie between 0.80 and 0.70. The hypothesis is therefore in
agreement with the recorded facts.

Example 15. In an experiment with poultry, a cross between a
white rose-combed cock with feathered shanks and a black
single-combed nonfeathered hen gave the following results in the F2
generation:

Table 27

                                          No. of birds
Phenotype                              Observed   Expected (m)    x^2/m
White, rose comb, feathered               115         108         0.453
White, single comb, feathered              38          36         0.111
Black, rose comb, feathered                35          36         0.028
White, rose comb, nonfeathered             25          36         3.360
White, single comb, nonfeathered           16          12         1.333
Black, single comb, feathered              13          12         0.083
Black, rose comb, nonfeathered             10          12         0.333
Black, single comb, nonfeathered            4           4         0.000
Total                                     256         256         5.701 = chi^2

The expected values have again been calculated on the
assumption that each of the three characters is segregating on a simple
3:1 ratio. There are 7 degrees of freedom, and $P$ is therefore
approximately 0.6. The results prove that the three
allelomorphs are inherited independently as unit characters in which
rose comb, white color, and feathered shanks are simple
dominants to single comb, black color, and nonfeathered shanks.
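
The expected values and $\chi^2$ for the trihybrid ratio can be reproduced as follows. This Python sketch is illustrative; it assumes the observed counts 115, 38, 35, 25, 16, 13, 10, 4 (the third being inferred from the total of 256) listed in the order of the 27:9:9:9:3:3:3:1 ratio.

```python
# Observed F2 counts, in the order of the 27:9:9:9:3:3:3:1 ratio.
observed = [115, 38, 35, 25, 16, 13, 10, 4]
ratio    = [27, 9, 9, 9, 3, 3, 3, 1]

total = sum(observed)
# Divide the total number of birds in the proportions of the ratio.
expected = [total * r / sum(ratio) for r in ratio]
chi2 = sum((o - m) ** 2 / m for o, m in zip(observed, expected))
dof = len(observed) - 1   # one comparison is fixed by the total

print(expected)
print(chi2, dof)
```

The value obtained agrees with the tabulated total to two decimal places; the third decimal depends on how the individual terms were rounded.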

Example 16. In another experiment with poultry, a cross
between walnut- and single-combed birds gave progeny with four
distinct comb phenotypes (Table 28).

It is obvious that there must be more than a single factor
difference between walnut and single comb. The observed numbers in
each phenotype are approximately equal and this would occur
where two factors are involved in a cross between a double
heterozygote and its double recessive. $\chi^2$ has been calculated on this
basis and corresponds to a probability slightly below 0.05. This
proves that the data are not in agreement with the hypothesis,
which may be fundamentally wrong or may merely require some
modification in order to bring the observed facts in line with the
expected. In either case, a more detailed analysis is required.

Table 28

Phenotype     Observed    Expected (m)    x^2/m
Walnut          ...           80           ...
Pea             ...           80           ...
Rose            ...           80           ...
Single          ...           80           ...
Total           320          320          7.825

On the original hypothesis, if $P$ and $R$ represent the pea- and
rose genes for dominance, and $p$ and $r$ the corresponding recessive
genes, the cross should be of the type

PpRr X pprr = PpRr + Pprr + ppRr + pprr
Walnut X single = Walnut + Pea + Rose + Single
(in approximately equal numbers)

In this event, the rose and the pea combs are simple dominants
to single, and the walnut comb is the result of the double
dominant in the germ plasm. On this assumption, it is possible to use
the observed data to ascertain how the unit characters are
segregating, which is equivalent to apportioning the total $\chi^2$ among its
components. The three available comparisons are

I.   P vs. p
II.  R vs. r
III. PR and pr vs. Pr and pR

Table 29. Analysis of $\chi^2$

Class  Type                            Observed  Expected (m)   x^2    x^2/m    Total
I      P present                         ...        160         ...    0.100    0.200
       P absent                          ...        160         ...    0.100
II     R present                         169        160          81    0.506    1.012
       R absent                          151        160          81    0.506
III    Double dominant or recessive      183        160         529    3.306    6.612
       Single dominant                   137        160         529    3.306
Total                                                                            7.824

The total value of $\chi^2$ is the equivalent of that originally
calculated. There are 3 degrees of freedom in all, one for each of
the components. The respective probabilities read from the
Table of $\chi^2$ are approximately:

Class I      0.60
Class II     0.30
Class III    0.01

The pea- and the rose-comb factors are therefore segregating in
accordance with expectation, but the third class is not. There is
a marked preponderance of the double dominant and double
recessive phenotypes. This indicates that the combination of
$PR$ and $pr$ occurs more frequently than $Pr$ or $pR$ and leads to the
conclusion that linkage between the pea- and the rose-comb
factors exists.
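
Each component in this analysis compares two classes expected in equal numbers, so a single small function covers all of them. The Python sketch below is illustrative; the function name `chi2_one_to_one` is an assumption, and the class totals used (169 and 151 for the rose factor, 183 and 137 for the linkage comparison, the 183 being inferred from the recorded square of 529) are taken from the analysis of Example 16.

```python
def chi2_one_to_one(first, second):
    """Chi-square (1 degree of freedom) for two classes expected in equal numbers."""
    m = (first + second) / 2          # expected value in each class
    return (first - m) ** 2 / m + (second - m) ** 2 / m

# R present vs. R absent.
chi2_rose = chi2_one_to_one(169, 151)

# Double dominant or double recessive vs. single dominant.
chi2_linkage = chi2_one_to_one(183, 137)

print(chi2_rose, chi2_linkage)
```

Only the second of these values exceeds the 0.05 level for 1 degree of freedom, which is what points to linkage rather than to faulty segregation of either factor.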

Example 17. In a cross between purple-sweet and white-starchy
corn, the following proportional frequencies (after East
and Hayes) from 11 different plants were noted in the F2
generation. The expected values have been calculated on the
9:3:3:1 ratio of a simple dihybrid.

Table 30. Distribution of Phenotypes in the F2 Generation of a Maize Cross

                                    Phenotypes
            Purple starchy    Purple sweet     White starchy    White sweet      Total
Plant no.   Obs.    Exp.      Obs.    Exp.     Obs.    Exp.     Obs.    Exp.     observed
1 to 11     ...     ...       ...     ...      ...     ...      ...     ...      ...

The totals of the phenotypes are obviously in good agreement
with the expected 9:3:3:1 ratio. For these totals,

$$\chi^2 = \frac{12^2}{909} + \frac{9^2}{303} + \frac{24^2}{303} + \frac{3^2}{101} = 2.418$$

For the available 3 degrees of freedom, the value of $P$ is
approximately 0.50, showing satisfactory agreement between data and
hypothesis.

It is again possible to effect a more critical analysis by resolving
$\chi^2$ into its component parts. The two classes, purple to white
grain, and starchy to sweet, should each show the normal 3:1 ratio
and provide the first two components of $\chi^2$. The final
component of $\chi^2$, which tests the third way in which the genes
recombine, is not obvious from the data. In the phenotypes it is not
possible to separate the homozygous dominant from the
heterozygous. It may be assessed by subtracting the aggregate of the
first two components from the total $\chi^2$ as originally calculated.

Table 31. Analysis of $\chi^2$

Class   Type       Observed    Expected (m)    chi^2
I       Purple      1,233        1,212         1.455
        White         383          404
II      Starchy     1,200        1,212         0.475
        Sweet         416          404
III     2.418 - (1.455 + 0.475)                0.488

The $P$ value for any of the components of $\chi^2$ lies between 0.50
and 0.20, proving that purple to white grains and starchy to
sweet are segregating on a simple 3:1 ratio without linkage.

An alternative way of computing these three components of
$\chi^2$ comes from the use of the following mathematical expressions
given by Fisher. If $a$, $b$, $c$, and $d$ represent the numbers in the
four phenotypes arranged in order according to the 9:3:3:1 ratio
and $n$ represents the total number of observations, then the
values of $\chi^2$ are equivalent to

$$\text{Class I: } \frac{(a + b - 3c - 3d)^2}{3n} \qquad \text{Class II: } \frac{(a - 3b + c - 3d)^2}{3n} \qquad \text{Class III: } \frac{(a - 3b - 3c + 9d)^2}{9n}$$

For example,

$$\text{Class I, } \chi^2 = \frac{(1{,}233 - 3 \times 383)^2}{3 \times 1{,}616} = 1.455$$

$$\text{Class II, } \chi^2 = \frac{(1{,}200 - 1{,}248)^2}{3 \times 1{,}616} = 0.475$$

$$\text{Class III, } \chi^2 = 0.488$$

$$\text{Total} = 2.418$$

The results therefore agree with those calculated by the other
methods.
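
Fisher's three expressions for a 9:3:3:1 ratio can be written as one small function. The Python sketch below is illustrative only. The phenotype totals used (921, 312, 279, 104) are an assumption: they are the values consistent with the marginal totals quoted above (purple 1,233; white 383; starchy 1,200; sweet 416) and with the deviations 12, 9, 24, and 3 in the expression for the total $\chi^2$. The function name is likewise introduced for this example.

```python
def dihybrid_components(a, b, c, d):
    """Three chi-square components (1 d.f. each) for a 9:3:3:1 ratio."""
    n = a + b + c + d
    class_1 = (a + b - 3 * c - 3 * d) ** 2 / (3 * n)      # first factor, 3:1
    class_2 = (a - 3 * b + c - 3 * d) ** 2 / (3 * n)      # second factor, 3:1
    class_3 = (a - 3 * b - 3 * c + 9 * d) ** 2 / (9 * n)  # recombination
    return class_1, class_2, class_3

# Assumed phenotype totals: purple starchy, purple sweet,
# white starchy, white sweet.
a, b, c, d = 921, 312, 279, 104
c1, c2, c3 = dihybrid_components(a, b, c, d)

# Direct chi-square on the four phenotypes, for comparison.
n = a + b + c + d
expected = [n * r / 16 for r in (9, 3, 3, 1)]
direct = sum((o - m) ** 2 / m for o, m in zip((a, b, c, d), expected))

print(c1, c2, c3, direct)
```

The three components add up exactly to the directly calculated total, so the third can always be found by subtraction; with these assumed totals the third component and the total may differ in the third decimal place from the printed figures, which were obtained from rounded terms.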

Analysis of the data can be carried out in still greater detail,
adding to the accuracy of the final conclusions. The records
show, not only the totals of each phenotype, but also the actual
numbers of each phenotype obtained from 11 separate plants.
Each family should according to hypothesis split up in the
9:3:3:1 ratio, making it possible to calculate a total $\chi^2$ with
11 X 3 or 33 degrees of freedom and then decide whether the
general conclusion based on the whole of the available data
applies equally effectively to all the individual units concerned.
Furthermore, by breaking up the total $\chi^2$ for each family into its
three components by the method already applied to the totals of
the phenotypes, a test of the way in which each family is
segregating is provided. For the first family, $\chi^2 = 0.243$.

Resolving this into its components,

Class I,   chi^2 = 0.121
Class II,  chi^2 = 0.032
Class III, chi^2 = 0.091
Total            = 0.243

Similar calculations, applied to the remaining 10 families,
provide the full $\chi^2$ analysis of Table 32.

Table 32. Detailed Analysis of $\chi^2$

                         Components of chi^2
            I                    II                               chi^2 for
Family      (purple vs. white)   (starchy vs. sweet)    III       each family
  1             0.121                0.032              0.091        0.243
  2             0.334                0.037              ...          ...
  3             0.000                0.122              0.363        0.485
  4             1.042                1.042              0.057        2.141
  5             0.037                0.148              ...          ...
  6             0.088                1.500              0.500        2.088
  7             0.111                2.252              0.232        2.595
  8             0.334                0.037              0.049        0.420
  9             0.334                0.037              0.000        0.371
 10             1.200                0.300              0.278        ...
 11             ...                  ...                ...          9.689
Total           ...                  ...                ...         21.552

Each of these components utilizes 1 degree of freedom, so that
the total $\chi^2$, 21.552, has 33 degrees of freedom. The Table of
$\chi^2$ does not record probabilities for degrees of freedom $n$ exceeding
30. When $n$ is greater than 30, the distribution of $\sqrt{2\chi^2}$ is
approximately normal, and the $x$ of the normal distribution may be
taken as equivalent in numerical value, irrespective of sign, to
$\sqrt{2\chi^2} - \sqrt{2n - 1}$. This expression is used to evaluate
the data, and the corresponding probability is
ascertained from the Table of $x$. The larger the
number of degrees of freedom of $\chi^2$, the more accurate this
approximation becomes. In the last example,

$$\sqrt{2\chi^2} - \sqrt{2n - 1} = \sqrt{2 \times 21.552} - \sqrt{2 \times 33 - 1} = -1.497$$

From the Table of $x$, $P = 0.14$, approximately, proving that the
hypothesis and the data as a whole are in agreement.
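
The normal approximation for more than 30 degrees of freedom is easily evaluated. The Python sketch below is an illustration; the function name is an assumption, and the probability is taken from the normal distribution through the standard-library function `math.erfc` rather than from a printed table.

```python
from math import erfc, sqrt

def normal_deviate_for_chi2(chi2, dof):
    """Approximate normal deviate for chi-square when dof exceeds 30."""
    return sqrt(2 * chi2) - sqrt(2 * dof - 1)

x = normal_deviate_for_chi2(21.552, 33)

# Probability of a deviation at least this large, irrespective of sign.
p_two_sided = erfc(abs(x) / sqrt(2))

print(x, p_two_sided)
```

The probability found in this way is close to the tabular value of about 0.14 quoted above.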

The column totals, each having 11 degrees of freedom, show
that segregation of the unit characters is occurring according to
expectation in the 3:1 ratio for purple to white and starchy to
sweet. The totals for the individual families prove that the first
10 have behaved strictly according to expectation. In the last
family, however, the expected 3:1 ratios do not occur, the
probability value for this component (9.689) being less than 0.05.
This does not upset the general hypothesis. The family in
question may have been subjected to some peculiar influence,
e.g., insect or fungal attack affecting seed formation. That some
abnormality occurred is suggested by the fact that the total
number of grains recorded for this plant is very much less than
the average for the other 10. Actually, when a critical
probability of 0.05 is being used, a single deviation as large as that
shown by family 11 is to be expected, at least once, in a frequency
array showing the contributions to $\chi^2$ for each degree of freedom
out of a total of 33.

Thus, the complete analysis does make it possible to give a
more critical interpretation of the data, as it not only takes into
consideration the general results but also traces to its source any
deviation from normal among the various components from
which the total is determined.

Another distribution to which the $\chi^2$ test applies is the Poisson
series. Like the binomial, it is an example of a discrete
distribution, in which entries generally occur in the form of integers, and
the range of possible values is limited. Therefore, the Poisson series
contrasts with the normal distribution which theoretically may
include any intermediate value from $-\infty$ to $+\infty$. In research
work, the use of the Poisson distribution is limited to certain
specialized problems, in which $P$, the probability of the event
occurring, is very small. Its application presupposes the recognition of
data which can correctly be classified as belonging to the Poisson
series. Yule points out that the advanced student may find this
distribution of considerable theoretical interest, but further
discussion here would definitely be out of place. This brief
explanatory note has been included only because the student will find
this distribution discussed in more technical works on statistics.

CHAPTER IV
DIAGRAMS

Before the present regime, in which the worker is
to give mathematical proof, or its equivalent, of the
accuracy of his conclusions, a diagrammatic presentation of the
data was one of the chief means used to interpret results. With
the recent advances in statistical technique, there has been a
tendency to regard the diagram as an obsolete method which
mathematical treatment has rendered no longer necessary,
whereas, in fact, the two methods are supplementary. Efficient
statistics supply adequate evidence that the conclusions are valid
and not based merely on apparent differences in the data due
entirely to chance variation outside the control of the operator.
Diagrams record the data in an easily assimilated form and make
it possible to obtain a clear grasp of the facts that the
mathematics have proved to be correct. For reference purposes,
diagrams are particularly useful, as they demonstrate at a glance
the salient features of the results of previous experiments and
show up points of resemblance and difference between these and
the current year's data. Furthermore, they will often indicate
certain features, sometimes of fundamental importance, that have
been entirely overlooked in the statistical analysis. Statistical
elaboration is only effective where the data are sufficient to yield a
competent estimate of the standard deviation. The design of an
experiment, especially in new lines of research, may be such that
mathematical proof of certain apparent differences is quite
impossible. The diagrams should show which of these are likely
to be important and guide the research worker in the planning of
later experiments so as to obtain sufficient data for a statistical
examination of these characters. Treatment effects discovered
by means of a diagrammatic presentation of the data should,
wherever possible, be supplemented by mathematical proof.
When this is forthcoming, the results can be safely regarded as

The first essential of a good diagram is lucidity; the important
features should stand out boldly, so that, merely by perusal of the
caption, the observer can not only comprehend what the diagram
purports to represent but also interpret for himself the significant
features. In this direction the skillful use of colors can often be
very effective, but even with plain black-and-white diagrams, a
certain amount of care in delineation will generally suffice to
emphasize the required points. The commonest mistake is the
inclusion of too many contrasting and possibly interacting factors
in a single diagram, which instead of clarifying the issue only
leads to confusion. The obvious remedy is to spread the data
over two or even three separate diagrams, possibly on a reduced
scale. By suitable subdivision of the data over the various
diagrams, it is possible to emphasize the particular relations
between the various factors that are considered to be of greatest
importance.

GRAPHS

The most commonly adopted type of diagram is the graph,
which in its simplest form shows the behavior of a given character
in relation to two contrasting factors plotted on squared paper
along axes at right angles. In such a graph, the choice of a
suitable scale for each axis is important, and as a general rule it is
advisable to aim at obtaining a curve which is located somewhere
in the vicinity of the diagonal between the axes. Any change in
slope will tend to be accentuated if this plan is followed.
Consider Fig. 2A in which the effects of varying the cutting rotation
on the yield of herbage over a 12-month period are shown. It is
obvious that there is a marked increase in total yield from series
A to D; also that for all four treatments the rate of increment tends
to decrease after November or December. In Fig. 2B, so many
factors have been superimposed that a very critical examination
is required before the significant features can be determined. In
neither of these diagrams has an attempt been made to level out
the variation between individual readings by tracing a smooth
curve more or less arbitrarily between the plotted points.
Adjacent readings are joined by a straight line. This has the
advantage of eliminating the human factor in plotting the final
curve, shows the actual values from which it was determined, and
may even make it apparent that certain fluctuations are not
fortuitous but rather the result of some external agency.

Fig. 2A. Yields of herbage from four harvesting rotations over a
period of twelve months.

Fig. 2B. Graph of crop yields from different fertilizers.

A useful alternative to the line graph is the columnar one in which
the recorded values are each represented by an area in the form
of a rectangle whose sides are parallel to the axes of the graph.
This type of graph is often preferred where the vertical axis
shows some quantitative return relative to some time interval as
plotted along the horizontal axis. Figure 3A is an example in
which both methods of presentation have been used effectively.
The rainfall is portrayed in the columnar form, the final figure
being a histogram. The total rainfall is proportional to the area
of the histogram. Comparison of the two graphs shows that on
the average an increase in the rainfall coincides with a drop in the
dry-matter percentage. This indicates an apparent negative
correlation between rainfall and dry-matter percentage. The
dry-matter figures are too few to allow the value of the correlation
coefficient to be calculated with any accuracy, and additional
records would be required if an estimate of this coefficient was
considered essential.

Fig. 3A. Dry-matter percentages and rainfall in inches for period August,
1935, to August, 1936.

An attempt is sometimes made to demonstrate in a single
figure exactly how a given character reacts to changes in three
external agencies. With certain types of data, this can be
achieved by dividing the columns of the histogram transversely
in proportion to the effects of the third factor under consideration.
An alternative is to build a solid model of rectangular blocks, so
as to depict changes in the interacting factors along three
dimensions at right angles. Figure 3B shows a model of this type
in which the effect of spacing, sowing date, and quantity of
fertilizer on the yield of cotton is depicted. It is obvious that the
optimum date of sowing lies in August. With early sowing there
is little difference in yield as a result of the particular spacing
used, but in the later sown plots wide spacing is apparently
preferable. The heavier dressings of manure definitely increase
yields if applied in July or August but have no advantage over the
control if the fertilizer is broadcast too late in the season. This
method of presentation is certainly spectacular and effective when
the model can be inspected. It has the disadvantage that the
figure is rather laborious to construct, and unless the interaction
effects are very marked, it is of little use for reproduction in print.



^ In graphs: in which growth measuremfmla-^w’e : height, 

girth, ;spp;ad----are plotted against tinic intervals, there^^^OT^ 

ataimattve; hiethods t>f ■ presenftitiOT.;' ' The; growth ds: norinalljr 

measured 'alonlgtheyertioaiMikand the tihie' the^hofkipfttai 

one, 'Tiie .yerfical 'Scaie. may khow:'^:: ^ ;■ 

;.;;fly'Fbe^ actua!'.mea8ureraent.; 

;l>. ;.|The;^apieriah ;Iogarithrn:^^ jhe' meaauremehl' (l0g,),d::;- "c:;: 

v;; ct,;'The:inerementtfrom.dhe piwious:me!k«remePti'::t 
: d. The reiative inerenjerit from ilie ;preyioupi;;pe^ureifient^ 

The increment c, or more accurately the absolute increment, is
assessed by subtraction of the reading at any period from the
succeeding one. The rate of increment is estimated by dividing
this figure by the number of units of time between the two
readings. Where readings are taken at regular intervals, the
period between successive readings may be made the unit of time;
then the rate of increment is equal to the increment, and the
division by the number of time units becomes superfluous. The
rate of increment represents the average rate during any particular
time interval and should be plotted against the mid-point of
this time period on the horizontal scale.


Table of Increments of a Growing Pig

Age of pig,  Weight,  Increment,  Rate of increment,  Napierian logarithm  Relative increment,  Relative increment,
weeks        kg.      kg.         kg. per week        of weight            % of weight          % per week

 4           22       ..          ..                  ..                   ..                   ..
                      ..          2.25                                     ..                   ..
 8           31                                       3.4340
 ..          ..       ..          ..                  ..                   ..                   ..
                                                                           17.44                ..
32           ..                                       4.3175
                      ..          ..                                       ..                   4.00
36           ..                                       4.4774
                      ..          ..                                       14.76                3.69
40           ..                                       4.6250




The relative increment takes into account not only the time
factor but also the size of the individual for which each increase is
recorded. Thus, an increment of 10 feet in the height of a tree
originally 20 feet high is obviously less than one of 15 feet in a
second tree, but, if the second tree was 50 feet high, its relative
increment is actually only about half that of the small tree. The
relative increment is measured by the difference between the
Napierian logarithms of successive measurements, expressed as a
percentage.

... the former two methods of graphical representation will provide a
better indication of the nature of the growth. The increment
graphs may add to the information. For example, in this figure,
line c appears to be linear in form, indicating that the increment is
increasing more or less in direct proportion with the age. The
relative increment, however, approximates to a falling curve
which is gradually flattening out. The drop in relative increment
is much more marked in the first 5 months than in the
second. Assuming that expenses in the form of rations, etc.,
increase roughly in proportion to the age of the animal, the
optimum time to sell would be when this curve of relative increment
tends to fall away more steeply for the second time. This
stage has not yet been reached with this particular pig.

Figure: Increment curves from a young pig.
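The increment, the rate of increment, and the relative increment described above can be sketched in a few lines of Python. This is only an illustration: the function name `growth_increments` and the three readings are assumptions made for the example, and the relative increment is taken as 100 times the difference between the Napierian logarithms of successive weights.

```python
import math


def growth_increments(ages, weights):
    """Return one record per interval between successive readings."""
    records = []
    for i in range(1, len(ages)):
        dt = ages[i] - ages[i - 1]                      # units of time in the interval
        increment = weights[i] - weights[i - 1]         # absolute increment
        relative = 100.0 * (math.log(weights[i]) - math.log(weights[i - 1]))
        records.append({
            "midpoint": (ages[i] + ages[i - 1]) / 2,    # plot rates against this point
            "increment": increment,
            "rate": increment / dt,
            "relative_pct": relative,
            "relative_pct_per_unit": relative / dt,
        })
    return records


# Illustrative readings only (age in weeks, weight in kg).
ages = [4, 8, 12]
weights = [22.0, 31.0, 41.0]
rows = growth_increments(ages, weights)
for row in rows:
    print(row)
```

Each record is tied to the mid-point of its interval, which is where the text recommends plotting the rate of increment.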

FREQUENCY DISTRIBUTIONS

Another diagram of rather a different character is the one
obtained by plotting a frequency distribution, of which a simple
example has already been given in Fig. 1. As a more adequate
illustration of this type of diagram, the data from Table 34 have
been used to plot Fig. 5A. These data represent the frequency
distribution of length measurements of 1,000 cacao beans
arranged in millimeter classes.

Table 34. Frequency Table for the Length of Cacao Beans in 1-mm. Classes

Length of beans, mm. (m)     No. of beans of each length, frequency (f)


The resultant frequency polygon (Fig. 5A) exemplifies many
of the features characteristic of diagrams of this type. The peak
of the polygon, the class containing the largest number of
individuals, is termed the mode of the curve. This must be
distinguished from the mean or arithmetic average of all the
readings. In the normal curve, which is symmetrical, the
mode and the mean coincide, but the polygon in Fig. 5A is
slightly skew and the two ordinates are quite separate, representing
values of 21 and 22 millimeters, respectively.

Another very obvious feature of this frequency distribution
for cacao beans is that individuals showing extreme deviations
from the mean, in either a positive or negative direction, are
comparatively rare, but as the class values approach the mean,
the number of individuals recorded in each class tends to get
progressively greater. Actually, if the number of variates is
reasonably large, most curves of this type conform roughly
to the expansion of the binomial $(a + b)^n$, where a and b are unity
and n is the number of classes into which the data have been
grouped. When an infinite number of continuous variates are
taken and the unit of measurement is made infinitely small,
the normal curve, on which many statistical tests of significance
depend, will ultimately be reached.
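As a rough illustration of the binomial expansion mentioned above, the coefficients of $(a + b)^n$ with a and b both equal to unity can be listed directly. The function name `binomial_frequencies` and the choice n = 6 are assumptions made for the example; the sketch only shows the symmetrical, centrally peaked shape and is not fitted to the cacao data.

```python
import math


def binomial_frequencies(n):
    """Coefficients of (a + b)**n with a = b = 1, one per term of the expansion."""
    return [math.comb(n, k) for k in range(n + 1)]


freqs = binomial_frequencies(6)
total = sum(freqs)              # equals 2**n when a = b = 1

# Crude text histogram: the middle terms are the most frequent.
for k, f in enumerate(freqs):
    print(k, f, "#" * f)
```

The terms rise to a single central peak and fall away symmetrically, which is the pattern the frequency polygon approximates.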

In Table 34, the length of each cacao bean has been recorded
to the nearest millimeter. The figures quoted for the length
of beans in the first column represent the mean of each class
and cover a range of sizes ±0.5 millimeter from the recorded
value. Thus, the two beans shown to be 13 millimeters in length
may actually measure any length between 12.5 and 13.5 millimeters.
This raises the question of the correct allocation of
beans whose size is exactly midway between the means of two
classes. Should a bean exactly 13.5 millimeters be
included in the 13 or the 14 millimeter class? Mathematicians
stipulate that where this occurs a frequency of one-half should
be included in each of the two classes. Provided the method
of measurement is sufficiently accurate, the number of individuals
showing values exactly midway between classes should be
relatively small. Half frequencies may not even appear in the
final frequency table, as an even number of half frequencies in
any class would make the total frequency of that class an integer.

In Table 34 the length of the beans varies between 13 and 32.
In many experiments, the range between the highest and the
lowest value is much greater than this, and it may be advisable
to reduce the number of classes by making each one cover a
wider range of measurements. As an illustration of how this
may be done, the data for the length of beans have been
grouped below into 7 instead of 21 classes. Each new class
includes all values ranging ±1.5 millimeters from the mean of
the class. The class interval, the difference between the mean
values of successive classes, is now 3 millimeters, or three times
as great as that originally used. The method in which the
recorded values have been regrouped is also indicated in Table 34.
The data from this second frequency table have been plotted
in Fig. 5B. This illustrates the general rule that coarser grouping
normally gives a much more regular type of frequency curve,
because the uncontrollable variation between adjacent frequencies
tends to be leveled out. In this graph, the mode and
the mean actually coincide. There is, of course, a limit beyond
which an increase in the size of the class interval is likely to lead
to a loss in accuracy, especially if it is intended to use the
frequency table to calculate the standard deviation of the variates.
In this example, the number of classes has been reduced below
the acceptable minimum; seven readings will rarely suffice to
fix a frequency curve with any accuracy. If the class interval
is not too large, the loss of accuracy caused by grouping is
negligible. This rule holds good, provided the class interval
does not exceed one-quarter of the value of the standard deviation.
In this experiment, as we shall see, the class interval is
actually greater than the standard deviation, proving that the
grouping is too coarse. Statistical calculations based on such a
frequency table would therefore be open to the criticism that an
additional and avoidable grouping error has been added to the
error.
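The regrouping of a frequency table into coarser classes, and the reading of the mode and mean from it, can be sketched as follows. The frequencies are invented for illustration (they are not the cacao data of Table 34), and the helper names `regroup` and `mode_and_mean` are assumptions of this sketch.

```python
def regroup(freq, width):
    """Merge each run of `width` adjacent classes into one coarser class.

    `freq` maps class mean -> frequency; the new class mean is the
    average of the class means that were merged.
    """
    keys = sorted(freq)
    coarse = {}
    for i in range(0, len(keys), width):
        chunk = keys[i:i + width]
        centre = sum(chunk) / len(chunk)
        coarse[centre] = sum(freq[k] for k in chunk)
    return coarse


def mode_and_mean(freq):
    """Mode = class with the largest frequency; mean = sum(f * m) / n."""
    n = sum(freq.values())
    mode = max(freq, key=freq.get)
    mean = sum(m * f for m, f in freq.items()) / n
    return mode, mean


# Illustrative frequencies in 1-mm classes.
fine = {13: 2, 14: 5, 15: 9, 16: 14, 17: 11, 18: 6}
coarse = regroup(fine, 3)            # 3-mm classes centred on 14 and 17
fine_mode, fine_mean = mode_and_mean(fine)
print(coarse, fine_mode, round(fine_mean, 2))
```

Regrouping leaves the total frequency unchanged; only the resolution of the table is reduced, which is why too coarse an interval costs accuracy.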


EVALUATION OF STANDARD DEVIATION FROM A FREQUENCY TABLE

Where the data are extensive, the grouping of the variates
into a frequency table greatly reduces the routine arithmetic
necessary to a statistical analysis. This relatively simple table
has been used as a means of demonstrating the various ways in
which the standard deviation can be estimated from data
arranged in the form of a frequency table. The direct calculation
of the sum of squares of the deviations from the mean is shown
in the second half of Table 35. In calculating M and σ, it must
be remembered that two factors have got to be considered, the
mean value m of each class and the frequency f with which it occurs.
Thus,

$$M = \frac{\Sigma(f \times m)}{n}$$

and

$$\sigma = \sqrt{\frac{\Sigma\{f \times (m - M)^2\}}{n - 1}}$$

Table 36. Calculation of Standard Deviation in Class Intervals

σ (in class intervals) = $\sqrt{713/999}$ class intervals = 0.844 class interval
                       = 2.53 mm. (as originally calculated)

Similarly, the S.S., 713, in class intervals
                       = 6,417 mm. (as originally calculated)


The column recording the deviations of the class means from
the general mean (m − M) will always be in arithmetical progression,
and this makes it possible to work throughout in class
intervals instead of in actual units of measurement (Table 36).
This is particularly useful when the class interval is not an integer.
When the class interval becomes the unit, the deviations will be
in the form of a regular sequence of positive or negative numbers,
1, 2, 3, 4, etc., on either side of the mean class with zero deviation.
The deviations in class interval units represent the true deviation
divided by the class interval.

It is possible to use either the assumed-mean or the variable-squared
method for estimating the standard deviation from a
frequency table. The former is probably the best general utility
method, as it eliminates the need of calculating Σ(f × m) to estimate
the general mean and avoids the rather cumbrous multiplications
of the variable-squared system. The application of both
methods to the same data is shown in Table 37, and is
self-explanatory.

Table 37. Calculation of Standard Deviation by Assumed-mean and
Variable-squared Methods
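The direct method and the assumed-mean method worked in class-interval units can be compared with a short Python sketch. The frequency table below is invented for illustration, and the function names are assumptions of this sketch. The final line only rechecks the arithmetic quoted in the text (an S.S. of 713 in class-interval units, 1,000 beans, and a 3-mm. class interval).

```python
import math


def sd_direct(freq):
    """Mean and standard deviation from deviations about the true mean."""
    n = sum(freq.values())
    mean = sum(f * m for m, f in freq.items()) / n
    ss = sum(f * (m - mean) ** 2 for m, f in freq.items())
    return mean, math.sqrt(ss / (n - 1))


def sd_assumed_mean(freq, assumed, interval):
    """Same quantities, working in class-interval units from an assumed mean."""
    n = sum(freq.values())
    dev = {m: (m - assumed) / interval for m in freq}     # ..., -1, 0, 1, 2, ...
    sum_fd = sum(freq[m] * dev[m] for m in freq)
    sum_fd2 = sum(freq[m] * dev[m] ** 2 for m in freq)
    ss_units = sum_fd2 - sum_fd ** 2 / n                  # corrected S.S. in class units
    mean = assumed + interval * sum_fd / n
    sd = interval * math.sqrt(ss_units / (n - 1))         # convert back to real units
    return mean, sd


# Illustrative table: class means 3 mm apart.
freq = {14: 3, 17: 10, 20: 5, 23: 2}
direct = sd_direct(freq)
coded = sd_assumed_mean(freq, assumed=17, interval=3)
print(direct, coded)

# Arithmetic check of the figures quoted in the text.
sd_beans_mm = 3 * math.sqrt(713 / 999)
print(round(sd_beans_mm, 2))
```

Both routes give the same mean and standard deviation; the coded version merely keeps the working figures small.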

CORRELATION DIAGRAMS

When several factors are being tested, it is often possible to
use the same series of individuals to provide measurements for
two or more different characters. These characters may be
interdependent to a greater or less extent, an alteration in one
tending to produce some corresponding change in the other. In
physics and chemistry, the relationship is often so complete that
a change in one factor produces an exactly proportionate change
in the second. In most biological problems, the affinity is much
less evident, but it is possible to obtain some idea of the general
nature of the association by plotting a dot diagram. In making
a dot diagram, the characters are not plotted against changes in
some variable external factor such as time intervals. The
values recorded for one character as measured along the abscissa
are plotted against the corresponding readings for the second
character along an ordinate at right angles. Each plotted point
is located by the coordinates of the two characters for one
individual. It is therefore essential to identify the individuals
for which each particular pair of readings has been recorded. If a
change in one character produces an exactly proportionate change
in the second, the plotted points will lie along a straight line. If
variation in one character has no influence on the readings
recorded for the second, the dots will be scattered irregularly
over the diagram. Depending on the degree of association
between the two factors, the arrangement of the dots may be
anything between these extremes. Figure 6 has been plotted
from the data in Table 38 recording the rainfall and mean yield
of maize over a 25-year period. It is very obvious from the
scatter of the 25 dots that high yields are on the average associated
with seasons of high rainfall. In interpreting a dot
diagram, it is generally advisable to divide it into four quadrants
by drawing the axes intersecting the scales at the mean values
of their respective factors. These quadrants have been numbered
I to IV in sequence. Practically all the dots lie in the first
and third quadrants; this is typical of data in which high values
in the one factor tend to correspond to high values in the second.
In other words, there is a positive correlation between rainfall
and yield. Since the dots cluster fairly closely round the median
line as plotted arbitrarily, it can be assumed that the correlation
is high and that rainfall and yield are closely linked. An arrangement
of dots similar to the above but located in the second and
fourth quadrants would indicate the same degree of association
between the two factors, but of the opposite sign, in which an
increase in the values of one character tends to be linked with a
decrease in those of the second. Such an arrangement would
show a negative correlation. If the dots lie more or less evenly
in all four quadrants, it can be assumed that there is no correlation
between the two characters.

Fig. 6. Dot diagram for rainfall and yield of maize over a 25-year period.
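The quadrant reading of a dot diagram can be imitated numerically by counting how many points fall in each quadrant about the two means. The sketch below assumes the usual numbering (I upper right, II upper left, III lower left, IV lower right); the data and the name `quadrant_counts` are invented for the example.

```python
def quadrant_counts(xs, ys):
    """Count points in quadrants I-IV about the means of x and y.

    Points lying exactly on either mean line are left out of the count.
    """
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    counts = {"I": 0, "II": 0, "III": 0, "IV": 0}
    for x, y in zip(xs, ys):
        dx, dy = x - mean_x, y - mean_y
        if dx == 0 or dy == 0:
            continue
        if dx > 0 and dy > 0:
            counts["I"] += 1
        elif dx < 0 and dy > 0:
            counts["II"] += 1
        elif dx < 0 and dy < 0:
            counts["III"] += 1
        else:
            counts["IV"] += 1
    return counts


# Illustrative readings only.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 6]
counts = quadrant_counts(xs, ys)
print(counts)
```

A preponderance of points in quadrants I and III suggests a positive correlation, and in II and IV a negative one; an even spread suggests none.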




CHAPTER V 


CORRELATION 

Scientific research generally entails the consideration of a
number of interacting factors, and it is often of primary importance
to know exactly the extent to which these various factors
influence one another. The correlation or degree of association
can be measured mathematically by calculating the correlation
coefficient. In estimating this, a table should be drawn
up to show, for any recorded value of one factor, the corresponding
value of the second. Table 38 is of this type and records,
for each year from 1883 to 1907, the mean yield of maize in Ohio
and the corresponding rainfall for the crop season. These
data have been used to exemplify the computation of a simple
correlation coefficient.

CALCULATION OF A CORRELATION COEFFICIENT 

Example 18. The entries in the last column of Table 38 are
obtained by multiplying the deviation for any x variate by the
deviation of the corresponding y variate, taking into consideration
the signs, + or −, of these deviations. The total of these
product deviations is termed the sum of products or S.P. Where
there are n pairs of readings, the sum of products will have n − 1
degrees of freedom. Just as the mean sum of squares is known
as the variance, so the sum of products divided by the number
of degrees of freedom has been termed the covariance. The
correlation coefficient r, between the two variables x and y, is given
by the expression

$$r = \frac{\text{covariance } xy}{\sqrt{\text{variance } x \times \text{variance } y}}$$

As the readings are in pairs, the number of degrees of freedom
of each variance and of the covariance is the same, and therefore

$$r = \frac{\text{S.P.}}{\sqrt{\text{S.S. } x \times \text{S.S. } y}}$$

Table 38. Yield of Maize and Rainfall in Inches, 1883 to 1907

(Columns: year; rainfall, total for season, in.; deviation. One row for
each year from 1883 to 1907, with totals and mean rainfall.)

r = +0.85
... factor independently and on the extent to which the deviation of
any given variate is reproduced in its opposite number.

When a positive correlation exists, positive deviations in x
will normally coincide with positive deviations in y, and the sum
of products will have a high positive value. In a negative correlation,
a positive deviation in x will normally be associated with
a negative deviation in y, and vice versa, and the sum of products
will have a high negative value. When the variation in the two
factors is entirely independent, positive and negative deviations
for any pair of variates will occur purely by chance, and on the
average of a large number of readings, the product deviations
will tend to cancel one another, giving a relatively low figure
for the sum of products. The correlation coefficient may take
any value between +1 and −1. It is not affected by the units
in which the variables are measured. If r is zero, the two factors
are independent; while the nearer r approaches to ±1, the greater
the degree of correlation. The sign of r will be the same as that
of the covariance and determines whether the correlation is
positive or negative, i.e., whether an increase in the one factor
is associated with an increase or with a decrease in the second.
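The relation between the sum of products, the covariance, and r can be sketched in Python as below. The readings are invented for the example and the function name `correlation` is an assumption of this sketch; the calculation follows the definitions given above, with n − 1 degrees of freedom for the variances and the covariance.

```python
import math


def correlation(xs, ys):
    """Return (sum of products, covariance, r) for paired readings."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    covariance = sp / (n - 1)
    variance_x = ss_x / (n - 1)
    variance_y = ss_y / (n - 1)
    r = covariance / math.sqrt(variance_x * variance_y)
    return sp, covariance, r


# Illustrative data: y rises exactly in proportion to x, then falls with it.
xs = [1, 2, 3, 4]
sp_up, cov_up, r_up = correlation(xs, [2, 4, 6, 8])
sp_down, cov_down, r_down = correlation(xs, [8, 6, 4, 2])
print(sp_up, cov_up, r_up)
print(sp_down, cov_down, r_down)
```

Because the degrees of freedom cancel, the same r is obtained from S.P. divided by the square root of S.S. x multiplied by S.S. y, and the sign of r follows the sign of the sum of products.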


SIGNIFICANCE OF A CORRELATION COEFFICIENT

Here again, the data used to calculate r represent only a sample
of the whole population, and the value of r obtained is therefore
only an estimate of the true coefficient of correlation. To
ensure even reasonable accuracy in this estimate, a relatively
large number of variates are required. It has been demonstrated
that for n = 100, a value of r of ±0.3 may be obtained purely
by chance from two characters known to be entirely independent.
In many experiments, the number of readings available is often
of necessity very much fewer than this, and with small samples
it is essential to apply a critical test of the significance of the
estimated correlation coefficient. In a correlation based on n
pairs of variates, the standard error normally attributed to r is
either

$$\frac{1 - r^2}{\sqrt{n}} \quad \text{or} \quad \frac{1 - r^2}{\sqrt{n - 1}}$$

Fisher points out that the correlation
coefficient may not be normally distributed and that, when the
sample is small or the correlation high, this standard error does
not provide a fair estimate of significance. With the relatively
small samples that have perforce often to
be used to estimate the correlation coefficient, the expression
for the standard error has to be modified to

$$\frac{\sqrt{1 - r^2}}{\sqrt{n - 2}}$$

The number of degrees of freedom attributed to r has been
reduced to n − 2, and the square root of 1 − r² has been introduced.
This modified expression is therefore bound to give a
higher value than that calculated from the standard formula,
and, if used in conjunction with the Table of t, it provides
a critical test of the significance of a correlation coefficient
evaluated from a limited number of pairs of observations. In
this test, the Table of t represents values of the correlation
coefficient in terms of the standard error. Thus

$$t \text{ (by calculation)} = \frac{r}{\sqrt{1 - r^2} / \sqrt{n - 2}} = \frac{r\sqrt{n - 2}}{\sqrt{1 - r^2}}$$

If reference to the Table of t for degrees of freedom equivalent to
n − 2 shows that this calculated value of t corresponds to a value
of P less than 0.05, the correlation coefficient may be considered
significant. Applying this test to the data in Table 38, in which
the number of pairs of readings is 25,

$$t \text{ (by calculation)} = \frac{0.85\sqrt{23}}{\sqrt{1 - 0.85^2}} = 7.73$$

For 23 degrees of freedom, the Table of t shows that the
probability of exceeding this calculated value purely by chance is very
much less than 0.01. The reading of t for n = 23 and P = 0.01
is only 2.87 as compared with the figure of 7.73 computed from
the data. This correlation coefficient of +0.85 is therefore
definitely significant, and it can be safely stated that, in the
particular county to which the data refer, high yields of maize
coincide with seasons of relatively heavy rainfall.
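The t test for a correlation coefficient can be checked with a few lines of Python. The function name `t_for_r` is an assumption of this sketch; the inputs r = 0.85 and n = 25 and the tabulated reading of 2.87 are the figures quoted in the text.

```python
import math


def t_for_r(r, n):
    """t = r * sqrt(n - 2) / sqrt(1 - r**2), with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


t_value = t_for_r(0.85, 25)
degrees_of_freedom = 25 - 2
tabulated_t = 2.87          # reading quoted in the text for P = 0.01

print(round(t_value, 2), degrees_of_freedom, t_value > tabulated_t)
```

With the rounded r of 0.85 the result comes out at roughly 7.7, in line with the 7.73 quoted; the small difference in the last figure presumably arises from rounding of r.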

EASY METHODS OF EVALUATION

The short methods of computation by squaring the variates or
the deviations from an assumed mean can generally be used
with advantage in calculating the correlation coefficient. The
estimation of the sum of squares of x and y presents no new
features. By the variable-squared method the sum of products
is equal to

$$\text{S.P.} = \Sigma(x \times y) - \frac{\Sigma x \times \Sigma y}{n}$$

By the assumed-mean system, the same expression holds good if
the symbols x and y are taken to represent deviations from their
respective assumed means. The application of this system to
rather more complex data is shown in the next example.
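That the two short methods give the same sum of products as the direct calculation can be verified on a small invented data set. The function names and the assumed means of 11 and 33 are choices made for this sketch only.

```python
def sp_direct(xs, ys):
    """Sum of products of deviations from the true means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))


def sp_variable_squared(xs, ys):
    """Sum(x*y) - Sum(x)*Sum(y)/n, using the raw variates."""
    n = len(xs)
    return sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n


def sp_assumed_mean(xs, ys, assumed_x, assumed_y):
    """Same expression applied to deviations from assumed means."""
    dx = [x - assumed_x for x in xs]
    dy = [y - assumed_y for y in ys]
    return sp_variable_squared(dx, dy)


# Illustrative paired readings.
xs = [10, 12, 15, 9]
ys = [30, 34, 40, 28]
direct = sp_direct(xs, ys)
squared = sp_variable_squared(xs, ys)
assumed = sp_assumed_mean(xs, ys, assumed_x=11, assumed_y=33)
print(direct, squared, assumed)
```

The assumed means may be any convenient round figures; the correction term removes whatever error their choice introduces.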

Example 19. Calculation of r from a Frequency Table by
Assumed-mean Method. Ideally r should be computed from a
large number of pairs of observations. Where this is practicable,
the arithmetical work can be greatly reduced by grouping the
variates into classes in a correlation table showing the frequency
with which readings for a given class in x are distributed over the
various classes of y, and vice versa. Table 39 is a correlation
table of this type. The records again show the yield of maize and
the rainfall over the same 25-year period. The data in this case
have been collected from four new centers, giving a total of 100
pairs of observations. It should be noted that such a correlation
table bears a marked resemblance to a dot diagram. In this
example, the arrangement of the frequencies over the squares
enclosed by the table is certainly not a random one. No entries
are located in the areas representing low yields and high rainfall
or high yields and low rainfall. Most of the entries lie in a strip
running diagonally from the first to the third quadrant, indicating
a positive correlation between rainfall and yield of maize. If, on
the other hand, the frequencies in a correlation table appear to be
scattered indiscriminately over all the squares, it is practically
certain that no significant correlation exists, and the estimation of
r becomes a work of supererogation. In using the correlation
table to form a rough idea of the existence or nonexistence of
correlation in the data, it is not only the number of squares that
are filled up that must be considered, but also the frequency
attributed to each. Where the majority of the higher frequencies
show some definite arrangement, a few single frequency entries
outside this arrangement are not likely to upset the general trend
of results. In this table, correlation is apparently present, and
the data have been used to estimate r by the assumed-mean
method. The calculations are shown in full and should be
self-explanatory.

* This value has been calculated from the rows and from the columns
independently as a check on the arithmetic.


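$$\text{Mean yield } (M_y) = \text{assumed mean of } y + \frac{\Sigma(f_y \times d_y)}{n} = 13 + \frac{11}{100} = 13.11 \text{ bu.}$$

$$\text{S.S. } y = \Sigma(f_y \times d_y^2) - \frac{[\Sigma(f_y \times d_y)]^2}{n} = 377 - \frac{11^2}{100} = 375.8$$

$$\text{Mean rainfall } (M_x) = \text{assumed mean of } x + \frac{\Sigma(f_x \times d_x)}{n} = 11 + \frac{-27}{100} = 10.73 \text{ in.}$$

$$\text{S.S. } x = \Sigma(f_x \times d_x^2) - \frac{[\Sigma(f_x \times d_x)]^2}{n} = 1{,}953 - \frac{(-27)^2}{100} = 1{,}945.7$$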

The final column in the correlation table records the product
deviations from the assumed means. To compute the true sum
of products based on deviations from the real means of x and y,
the total of this column has to be corrected. The correction
factor here is

$$\frac{\text{the total deviation of } x \times \text{the total deviation of } y}{n}$$

(from their respective assumed means)

This value has got to be subtracted from the total of the last
column of the table, taking into account the signs, positive or
negative, of the various values.

In this example,

$$\text{S.P.} = +471^{*} - \frac{11 \times (-27)}{100} = +474.0$$

$$\text{Correlation coefficient } r = \frac{\text{S.P.}}{\sqrt{\text{S.S. } x \times \text{S.S. } y}} = \frac{+474}{\sqrt{1{,}945.7 \times 375.8}} = +0.55$$


In this example n − 2 equals 98, and the Table of t gives only
the theoretical distribution of t for degrees of freedom ranging from
1 to 30. Above this number of degrees of freedom, t approximates
to the values given in the Table of x for the normal distribution,
and when n is large, the Table of x or the reading from the
Table of t for n = ∞ may be validly used to test the significance
of r.

$$t \text{ (by calculation)} = \frac{0.55\sqrt{98}}{\sqrt{1 - 0.55^2}} \approx 6.5$$

The Table of t shows this value to be significant on a probability
much less than 0.01. With large samples, this test is almost
identical with the use of the ordinary standard error of r. The
correlation of +0.55 is certainly significant, proving once again
that an increase in the rainfall tends to produce a rise in the mean
yield of maize.

STATISTICAL COMPARISON OF CORRELATION COEFFICIENTS

When two or more independent estimates of the coefficient of
correlation of a given population are available, it is often of some
importance to ascertain whether they are significantly different or
not. The distribution of r may not be normal, and the calculated
standard errors in conjunction with the Table of t should not be
used to determine whether the difference between the individual
estimates of r is significant or not. It is possible, however, to
express any value of r in terms of z, and as z is known to be
distributed normally, the standard tests of significance based on
the normal distribution as elaborated in the Table of x may then
be validly applied.

$$z = \tfrac{1}{2}\{\log_e (1 + r) - \log_e (1 - r)\}^{*}$$

If n represents the number of pairs of observations from which
r has been estimated, the standard error of z is equal to

$$\frac{1}{\sqrt{n - 3}}$$

Significance of Difference between Two Estimates of r. In the
preceding examples, two estimates of the correlation between
rainfall and yield of maize have been worked out, viz.,

r_1 = +0.85 from 25 pairs of observations
r_2 = +0.55 from 100 pairs of observations

* Cf. z used in the comparison of two variances (p. 42).

The corresponding z values, as obtained by substitution in the
above expression, are

z_1 = 1.2561
z_2 = 0.6184
Difference = 0.6377

The standard error of this difference z_1 − z_2 is from first
principles the square root of the sum of the squares of the
individual standard errors. n_1 and n_2 are 25 and 100, respectively,
and therefore

$$\text{Standard error of the difference } z_1 - z_2 = \sqrt{\frac{1}{22} + \frac{1}{97}} = 0.236$$

To be significant, a difference between values that are normally
distributed must be greater than twice its standard error. The
difference of 0.6377 ± 0.236 is therefore definitely significant.
The actual probability can be read from the Table of x:

$$x \text{ (by calculation)} = \frac{0.6377}{0.236} = 2.702$$

Reference to the Table of x shows that this calculated value of
2.702 corresponds to a probability less than 0.01. This proves
that the correlation between rainfall and yield of maize tends to be
higher in the locality where the first series of records was taken
than in the other centers. A possible explanation of this might
be deduced from an examination of the major soil types in the
different areas.
Comparison of Several Estimates of r from Same Population.
When a number of independent estimates of any correlation
coefficient are available, as computed from different samples from
the same apparent population, it is often advantageous to determine
the mean value of r for the whole of the recorded data. This
not only provides a convenient method of summarizing the
results but may also prove satisfactorily the existence of correlation
in cases in which one or more of the independent estimates
are nonsignificant. The mean coefficient of correlation is
obtained by expressing each estimate of r in terms of z, calculating
the mean z from these and changing this mean value back to the
corresponding value of r. This technique is valid only when the
total number of correlation coefficients combined together to
provide the mean value is small in comparison with the number of
variates in the individual samples. Hayes and Garber,* from
data covering a number of consecutive years, record a positive
correlation between the yield of wheat and the size of the
individual grains. It is possible to use the data quoted below for
three separate harvests to calculate the mean value of this
correlation coefficient.

Year     No. of selections, or samples,     Correlation        z = ½{log_e (1 + r)
         from which each r was              coefficient (r)        − log_e (1 − r)}
         calculated (n)

1914     70                                 +0.431             +0.4611
1915     70                                 +0.519             +0.5759
1916     70                                 +0.355             +0.3723

Total                                                           1.4093

The corresponding value of r is calculated from

$$\frac{1 + r}{1 - r} = e^{2z}, \qquad r = \frac{e^{2z} - 1}{e^{2z} + 1}$$

r may be evaluated directly from the above expression with the
aid of ordinary logarithms, but it is simpler to obtain the value
of $e^{2z}$ by ascertaining from the table of Napierian logarithms the
number whose Napierian logarithm is 2z. In the above example,
$2z_m$ = 0.9396, and reference to the table
shows that this is the logarithm of 2.559.

$$r_m, \text{ the mean value of } r = \frac{2.559 - 1}{2.559 + 1} = +0.4380$$

$z_m$ is normally distributed, and its standard error is

$$\frac{1}{\sqrt{p(n - 3)}}$$

where p is the number of independent estimates of r, and n the
number of individuals in each sample from which these estimates
were derived.

The standard error of $z_m$ in the above example

$$= \frac{1}{\sqrt{3(70 - 3)}} = \frac{1}{\sqrt{201}} = 0.0705$$

$z_m$ is much greater than twice its standard error and therefore
significant, proving that there is a definite positive correlation
between the seed size and yield of wheat, with a mean value,
over a 3-year period, of +0.4380.

The collection of data for the determination of the correlation
coefficient between yield of wheat and size of grain was continued
for two more years, and the full data are recorded in Table 40
along with the requisite calculations for the computation of the
mean value ($r_m$) for the 5-year period.

* "Breeding Crop Plants," McGraw-Hill Book Company, Inc., New York.
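The averaging of several estimates through z, and the return from the mean z to r, can be retraced in Python. The function name `mean_r_from_z` is an assumption of this sketch; the three z values and the sample size of 70 are those quoted in the text.

```python
import math


def mean_r_from_z(z_values):
    """Average the z values, then convert the mean z back to r."""
    z_mean = sum(z_values) / len(z_values)
    e2z = math.exp(2.0 * z_mean)            # the number whose Napierian log is 2z
    return z_mean, e2z, (e2z - 1.0) / (e2z + 1.0)


z_values = [0.4611, 0.5759, 0.3723]         # z for the three harvests
sample_size = 70                            # individuals in each sample

z_mean, e2z, r_mean = mean_r_from_z(z_values)
standard_error = 1.0 / math.sqrt(len(z_values) * (sample_size - 3))

print(round(2 * z_mean, 4), round(e2z, 3), round(r_mean, 4))
print(round(standard_error, 4), z_mean > 2 * standard_error)
```

The conversion used here is the same as reading 2z from a table of Napierian logarithms; `math.exp` simply replaces the table lookup.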

The transformation of z to r can be simplified if Fisher's
Table VB ("Statistical Methods for Research Workers"), showing
values of r for different values of z from 0 to 3, is available.

The size of the different samples, from which the various
estimates of r have been obtained, is not always the same, and
when this occurs, it is necessary in calculating $z_m$ to take into
account the number of variates in the samples from which each
individual z has been determined. In these circumstances the
best formula to use is

$$z_m = \frac{z_1(n_1 - 3) + z_2(n_2 - 3) + \cdots + z_p(n_p - 3)}{(n_1 - 3) + (n_2 - 3) + \cdots + (n_p - 3)}$$

where p samples with $n_1, n_2, \ldots n_p$ variates, respectively, are
available, giving values of r equivalent to $r_1, r_2, \ldots r_p$ or
$z_1, z_2, \ldots z_p$, respectively. This will ensure that the final
estimate of $z_m$ is weighted correctly in accordance with the number
of individuals in the various samples from which it has been
determined.

The standard error of $z_m$ will be equivalent to

$$\frac{1}{\sqrt{(n_1 - 3) + (n_2 - 3) + \cdots + (n_p - 3)}}$$

Table 40. Values of r and z for Correlation between Yield of
Wheat and Size of Seed, 1914-1918

No. in sample (n)     Correlation coefficient for each year (r)

70                    +.431
70                    +.519
70                    +.355
35                    +.580
63                    ..

Total 308

$$z_m = \frac{(0.4611 \times 67) + (0.5759 \times 67) + (0.3723 \times 67) + (0.6624 \times 32) + (0.062\ldots \times 60)}{308 - (3 \times 5)}$$

$z_m$ is much greater than twice its standard error and is therefore
significant.

The number whose Napierian logarithm is $2z_m$ is 2.259.

The mean coefficient of correlation is therefore +0.3863, a value
which is definitely significant. In calculating this mean value, it
is assumed that the independent estimates of r, as evaluated
annually, all belong to the same general population and that any
differences between them represent the ordinary errors of random
sampling. It may happen that one or more of the estimates show
an unexpectedly large deviation from the average value, and it
might be interesting to know whether these extreme values of r
can be regarded as differing significantly from the other estimates.

The Method of Testing Homogeneity of a Group of Correlation
Coefficients. The best index to show the degree of difference
between a number of independent estimates of r is

$$\chi^2 = \Sigma\{(z - z_m)^2 (n - 3)\} \text{ for all the samples.}$$

This $\chi^2$ will have p − 1 degrees of freedom, where p again
represents the number of independent estimates or samples from
which $z_m$ has been evaluated.

Reference to the preceding numerical example recording the
correlation between yield of wheat and size of seed shows that the
correlation coefficient for the year 1918 is much lower than for any
of the other years. Does this particular estimate lie outside the
range of values covered by the ordinary errors of random
sampling?


TABLE 41. CALCULATION OF $\chi^2$ FROM ESTIMATES OF z RECORDED
IN TABLE 40
($z_m$ = 0.4075)

Estimates of z        z - z_m        n       (z - z_m)^2 (n - 3)

z_1    0.4611         +0.0563        70            0.212
z_2    0.5759         +0.1684        70            1.900
z_3    0.3723         -0.0352        70            0.083
z_4    0.6624         +0.2549        35            2.078
z_5    0.0620         -0.3455        63            7.162

Total                                             11.435 = $\chi^2$


The Table of $\chi^2$ shows that, for the available 4 degrees of
freedom (p - 1), a value of $\chi^2$ of 11.435 corresponds to a
probability of approximately 0.02, proving that there is a significant
difference between the individual estimates of z.
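The weighted mean and the homogeneity test can be reproduced with a brief sketch. It assumes the n and z values as read from Tables 40 and 41, and it relies on the fact that for exactly 4 degrees of freedom the upper-tail probability of $\chi^2$ has a simple closed form; for other degrees of freedom a table or a statistical library would be needed. Small differences from the printed figures in the third decimal place are to be expected, since the tabulated working was rounded.

```python
import math

# (n, z) for each of the five years, as read from Tables 40 and 41.
samples = [(70, 0.4611), (70, 0.5759), (70, 0.3723), (35, 0.6624), (63, 0.0620)]

# Weight each z by (n - 3), the reciprocal of its sampling variance.
weights = [n - 3 for n, _ in samples]
z_m = sum(w * z for w, (_, z) in zip(weights, samples)) / sum(weights)
se_z_m = 1.0 / math.sqrt(sum(weights))

# Homogeneity statistic: sum of (z - z_m)^2 (n - 3), with p - 1 d.f.
chi_sq = sum(w * (z - z_m) ** 2 for w, (_, z) in zip(weights, samples))
dof = len(samples) - 1

# Closed form valid only for 4 degrees of freedom:
# P(chi-square > x) = exp(-x/2) * (1 + x/2).
p_value = math.exp(-chi_sq / 2) * (1 + chi_sq / 2)

print(f"z_m = {z_m:.4f} (S.E. {se_z_m:.4f}), mean r = {math.tanh(z_m):+.4f}")
print(f"chi-square = {chi_sq:.3f} on {dof} d.f., P = {p_value:.3f}")
```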

The standard error of z is $\frac{1}{\sqrt{n - 3}}$, so that the standard error
of the difference between $z_1$ and $z_5$, or $z_2$ and $z_5$, is

$$\sqrt{\frac{1}{67} + \frac{1}{60}} = 0.178$$

A difference between the estimates of z greater than
2 × 0.178 = 0.356 is significant.

The estimate of r for 1918 is significantly lower than that for any
of the other years except 1916. It would appear therefore that in
1918 some factor, possibly some climatic peculiarity, has reduced
below normal the degree of correlation usually expected between
the yield of wheat and the size of seed.

It should be carefully noted that it is not valid to select one pair
of z values out of a group of similar estimates and compare them
by means of their standard errors until the $\chi^2$ test has
demonstrated that there is a significant difference between the
estimates when the group is considered as a whole. The $\chi^2$ test,
therefore, determines whether the population, from which the various
samples and estimates of r have been obtained, is homogeneous or not.

PARTIAL CORRELATIONS

The correlation coefficient between two variables A and B
measures the extent to which A responds to known changes in B,
or vice versa. In any research problem, there will usually be
other agencies, C, D, E, etc., with which A is also likely to be
correlated to a greater or lesser extent. In order to obtain an
accurate understanding of the various phenomena at work on the
character A, it is not sufficient to base the conclusions on the
separate correlation coefficients, AB, AC, AD, calculated
independently. The effect of the one agency may be such as to
mask or cancel the true influence of the second. For example,
heavy rainfall or high summer temperatures may both be
positively correlated with high crop yields. It is quite possible,
however, for wet seasons to be negatively correlated with
temperature, and this would almost certainly lead to an apparent
absence of correlation between yield and temperature. Where
any character under examination is known to be affected by
various external factors, it is essential, in determining any
particular correlation, to make due allowance for all the other
influential factors covered by the data. This is best effected
by calculating what is termed the partial correlation coefficient
to distinguish it from the total correlation coefficient based on
data from two factors only, as already discussed in the preceding
paragraphs. A partial correlation measures the correlation
between any two variables, A and B, when all other variables,
C, D, E, etc., are kept constant. The elimination of the influence
of the balance of the variables is effected in the mathematical
calculations. The first step is to work out the total correlation
coefficients for all possible combinations of the variables taken in
pairs. Let $X_1$, $X_2$, $X_3$ represent three interacting factors and
$r_{12}$, $r_{13}$, $r_{23}$ the respective total correlation coefficients
between each pair. Then the partial correlation $r_{12.3}$ between $X_1$
and $X_2$, with $X_3$ held constant, is given by the equation

$$r_{12.3} = \frac{r_{12} - r_{13}\,r_{23}}{\sqrt{(1 - r_{13}^2)(1 - r_{23}^2)}}$$




[Table: logarithmic computation of the partial correlations for Example 20.]


Example 20. Estimation of a Partial Correlation Coefficient. Yule
records the following data showing the total correlations over a
20-year period between:

1. The yield of hay in hundredweights.

2. The total rainfall in inches.

3. The accumulated spring temperature.

$$r_{12} = +0.80, \qquad r_{13} = -0.40, \qquad r_{23} = -0.56$$

It is desired to ascertain from these the true effect of rainfall and
of temperature on the yield. In determining the two partial
correlations from the equation, it is a good plan to work in
logarithms throughout and construct a table of the type appended.

The denominator of the algebraic expression is bound to be
positive, so that the sign of the partial correlation will be the
same as that of the numerator. The table is otherwise perfectly
straightforward and merely details an easy and accurate method
whereby the value of the right-hand side of the partial correlation
equation can be computed.
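As an alternative to the logarithmic working, the partial correlation equation can be evaluated directly. The sketch below assumes only the three total correlations quoted from Yule; the function name `partial_r` is illustrative.

```python
import math

def partial_r(r_ab, r_ac, r_bc):
    """First-order partial correlation of a and b with c held constant."""
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac**2) * (1 - r_bc**2))

# Total correlations recorded by Yule: 1 = yield of hay, 2 = rainfall,
# 3 = accumulated spring temperature.
r12, r13, r23 = 0.80, -0.40, -0.56

r12_3 = partial_r(r12, r13, r23)   # yield and rainfall, temperature constant
r13_2 = partial_r(r13, r12, r23)   # yield and temperature, rainfall constant
r23_1 = partial_r(r23, r12, r13)   # rainfall and temperature, yield constant

print(f"r12.3 = {r12_3:+.3f}, r13.2 = {r13_2:+.3f}, r23.1 = {r23_1:+.3f}")
```

The denominator is a square root and hence positive, so each result takes the sign of its numerator, as noted in the text.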

The determination of the significance of a partial correlation is
similar to the t test used for a total correlation coefficient with
the proviso that the number of degrees of freedom from which t is
computed must be reduced by a quantity equal to the number of
factors that have been eliminated in estimating the partial
correlation. In this example, n, the number of readings in each
series, is 20, and only a single character is held constant in each
partial correlation. Therefore for $r_{12.3}$,

$$t \text{ (by calculation)} = r_{12.3}\sqrt{\frac{n - 3}{1 - r_{12.3}^2}} = 0.759 \times \sqrt{\frac{17}{1 - 0.759^2}} = 4.802$$

Reference to the Table of t opposite 17 degrees of freedom shows that
this value of t corresponds to a probability markedly less than
0.01. This partial correlation is definitely significant. A
similar test applied to the other two partial correlations $r_{13.2}$ and
$r_{23.1}$ shows them to be nonsignificant as determined on a
probability of 0.05. $r_{13.2}$, the correlation coefficient between yield
and spring temperature with the effect of rainfall eliminated, is
obviously negligible. This is rather in contrast to the total
correlation coefficient $r_{13}$ for the same two factors, whose value
of -0.40 approaches the significant level (P = 0.07, approximately).
It is even possible that the partial and total correlations are
significantly different, and, as a test of this, transformation of
the r values to z has been carried out.
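The t test for a partial correlation can be sketched as follows. The helper name is illustrative; the value of $r_{23.1}$ (about -0.436) is computed from the partial correlation equation rather than quoted from the text, and 2.110 is the tabulated 5 per cent point of t for 17 degrees of freedom.

```python
import math

def t_for_correlation(r, n, eliminated=0):
    """Return (t, degrees of freedom) for a correlation from n readings
    with `eliminated` factors held constant (0 for a total correlation)."""
    dof = n - 2 - eliminated
    return r * math.sqrt(dof / (1 - r**2)), dof

n = 20
T_5_PERCENT_17_DF = 2.110  # tabulated two-sided 5 per cent point, 17 d.f.

for label, r in [("r12.3", 0.759), ("r13.2", 0.097), ("r23.1", -0.436)]:
    t, dof = t_for_correlation(r, n, eliminated=1)
    verdict = "significant" if abs(t) > T_5_PERCENT_17_DF else "not significant"
    print(f"{label}: t = {t:+.3f} on {dof} d.f., {verdict} at P = 0.05")
```

Only the first of the three partial correlations exceeds the tabulated value, in line with the conclusion reached in the text.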


$$r_{13} = -0.40, \qquad z_1 = \tfrac{1}{2}(\log_e 0.60 - \log_e 1.40) = -0.4237$$

$$r_{13.2} = +0.097, \qquad z_2 = \tfrac{1}{2}(\log_e 1.097 - \log_e 0.903) = +0.0974$$

$$\text{Difference between } z_1 \text{ and } z_2 = 0.5211$$

The number of degrees of freedom is 18 and 17, respectively, so
that the standard error of this difference is $\sqrt{\frac{1}{17} + \frac{1}{16}} = 0.348$.
The difference does not exceed twice its standard error and is
therefore not significant.
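The comparison of the total and partial correlations on the z scale can be sketched as below. It rests on an assumption that should be checked against the tables in use: the variance of z is taken as 1/(n - 3) for a total correlation and 1/(n - 3 - k) when k factors have been eliminated.

```python
import math

def fisher_z(r):
    """z = (1/2) * {log_e(1 + r) - log_e(1 - r)}."""
    return 0.5 * (math.log(1 + r) - math.log(1 - r))

n = 20
z_total = fisher_z(-0.40)     # total correlation r13
z_partial = fisher_z(0.097)   # partial correlation r13.2

# Assumed variances: 1/(n - 3) for the total correlation and
# 1/(n - 3 - 1) for the partial correlation with one factor eliminated.
se_diff = math.sqrt(1 / (n - 3) + 1 / (n - 3 - 1))

difference = abs(z_total - z_partial)
print(f"z1 = {z_total:+.4f}, z2 = {z_partial:+.4f}")
print(f"difference = {difference:.4f}, twice S.E. = {2 * se_diff:.3f}")
print("significant" if difference > 2 * se_diff else "not significant")
```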

From these results, it becomes obvious that the climatic factor
which is of primary importance in influencing yield is the rainfall.
This conclusion illustrates another important point in the
interpretation of correlations, viz., the need of starting with some
logical hypothesis which will make it possible to separate, for any
given correlation, the causative attribute from the dependent one.
In this example, there is no doubt but that it is the rainfall which
is influencing the yield of the crop. With other data in which
close affinity can be proved between the attributes, a satisfactory
evaluation of the results may be impracticable on account of the
impossibility of defining whether it is X that is responsible for
changes in Y or vice versa. For example, a positive correlation
between root development and number of tillers might mean
either that plants with numerous tillers develop a bigger root
system or that a good root system encourages tillering. The
mechanical computation of correlation coefficients is of little
practical value without the necessary knowledge of the basic
character of the attributes which alone will lead to a valid
interpretation of results. A lack of understanding of this principle
has been responsible for some misuse of the correlation weapon in
the past, and, in some instances, has led to apparently striking but
false deductions. These in turn have tended to attach to the
correlation coefficient a certain degree of disrepute which is
entirely unwarranted. There can be no doubt that in the hands
of the expert, the proper application of the correlation theory has
added greatly to scientific knowledge, particularly in sociological
and biological problems; also even with the novice, errors of
interpretation may be safely avoided if deductions from the
correlation coefficients are limited to examples in which the basic
premises are known to be accurate.

It is possible to extend the partial correlation equation to cover
data in which more than three interacting factors have to be taken
into consideration in estimating correlation effects. Take the
simplest case in which the data show the corresponding n readings
for four variables, and the partial correlation required is that
between the first two factors with the last two held constant,
viz., $r_{12.34}$. The first step is to calculate, for the four variates
taken in pairs, all the total correlations $r_{12}$, $r_{13}$, $r_{14}$, $r_{23}$,
$r_{24}$, $r_{34}$. From these, by substitution in the original equation,
the three partial correlations $r_{12.4}$, $r_{13.4}$, $r_{23.4}$ can be
calculated. These represent the correlations between factors 1, 2,
and 3 taken in pairs, when factor 4 has been eliminated. These
values can now be used as simple correlations, as designated by the
index numbers preceding the point in each r, between three variates
1, 2, 3, in order to assess the partial correlation between 1 and
2, when 3 is held constant.

Thus,

$$r_{12.34} = \frac{r_{12.4} - r_{13.4}\,r_{23.4}}{\sqrt{(1 - r_{13.4}^2)(1 - r_{23.4}^2)}}$$

If, on the other hand, it had been desired to ascertain the
correlation between factors 1 and 3, when 2 and 4 are eliminated, the
equation then becomes

$$r_{13.24} = \frac{r_{13.4} - r_{12.4}\,r_{23.4}}{\sqrt{(1 - r_{12.4}^2)(1 - r_{23.4}^2)}}$$
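The step-by-step elimination of factors lends itself to a recursive sketch. The six total correlations below are hypothetical values chosen purely for illustration, and the function name and dictionary layout are likewise assumptions rather than anything prescribed by the text.

```python
import math

def partial_r(totals, i, j, held=()):
    """Partial correlation of factors i and j with the factors in `held`
    eliminated one at a time; `totals` maps a sorted pair (a, b) to r_ab.
    The last factor in `held` is eliminated first."""
    if not held:
        return totals[tuple(sorted((i, j)))]
    k, rest = held[0], held[1:]
    r_ij = partial_r(totals, i, j, rest)
    r_ik = partial_r(totals, i, k, rest)
    r_jk = partial_r(totals, j, k, rest)
    return (r_ij - r_ik * r_jk) / math.sqrt((1 - r_ik**2) * (1 - r_jk**2))

# Hypothetical total correlations for four factors (illustration only).
totals = {(1, 2): 0.60, (1, 3): 0.50, (1, 4): 0.40,
          (2, 3): 0.30, (2, 4): 0.20, (3, 4): 0.10}

r12_34 = partial_r(totals, 1, 2, held=(3, 4))  # eliminate 4 first, then 3
r12_43 = partial_r(totals, 1, 2, held=(4, 3))  # eliminate 3 first, then 4
r13_24 = partial_r(totals, 1, 3, held=(2, 4))

print(f"r12.34 = {r12_34:+.4f} (other order of elimination: {r12_43:+.4f})")
print(f"r13.24 = {r13_24:+.4f}")
```

The order in which the unwanted factors are eliminated does not affect the result, which provides a convenient check on the arithmetic.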

The number of degrees of freedom of either of these partial
correlations is n - 2 - 2. In general, where p factors have
been eliminated, it is n - p - 2. Tests of significance are carried out
as described for total correlations, with the exception of the
decrease in the number of degrees of freedom upon which they
are based. It will be readily understood that this process of
eliminating unwanted factors one by one in the interpretation
of complex data can be extended theoretically to any number of
interacting factors. It is wise, however, to bear in mind that
each additional factor will be responsible for a marked and
progressive increase in the magnitude of the arithmetical
calculations. The following tables have been compiled to facilitate the
calculation and interpretation of correlation coefficients:

"Tables of $1 - r^2$ and $\sqrt{1 - r^2}$," by J. R. Miner, Baltimore.

Table V.A. "Values of r for different values of P and n."

Table V.B. "Table of r for values of z from 0 to 3."
"Statistical Methods for Research Workers," by R. A. Fisher.

INTRACLASS CORRELATIONS

In the computation of the coefficient of correlation from
experimental data, the pairs of readings from which r is determined can
usually be correctly allocated to two well-defined x and y classes,
e.g., in the correlation between yield and rainfall, parent and
child, height and age, etc. If the classification is complete, it should
not be possible for a variate that rightly belongs to the x group to
become included in the y group. With other types of data, it
may be impossible to tell from any character difference which
reading of any pair belongs to the x and which to the y group.
It then becomes immaterial how the allocation of the pairs between
x and y is made. Thus in determining the correlation between
paired chromosomes in the somatic cell, it might be impossible to
differentiate between the individuals in any one pair. Or again,
twin ram lambs are obviously identical types, and in measuring
the correlation between such twins, no classification to type is
practicable. On the other hand, if one of each pair is a ewe and
the other a ram lamb, the observations would naturally be
grouped according to sex, x for the female and y for the male.

Example 21. Computation of Interclass and Intraclass Correlation
Coefficients for Twin Lambs. When the pairs of readings
cannot be accurately separated into two distinct x and y
classes, the method of calculating the coefficient of correlation is
slightly modified to evaluate what is termed the intraclass
correlation as distinct from the interclass correlation as previously
discussed. As a means of illustrating the difference in procedure,
the interclass and the intraclass correlation coefficients have both
been worked out for the following data for the weight of twin
lambs at 3 months of age. In the first calculation, the x readings
are taken to be for ewe lambs and the y readings for ram lambs,
when the interclass correlation is the one required. In the second
calculation, all the twins are assumed to belong to the same sex,
when no x and y classification is practicable and the intraclass
correlation is the one to apply.


TABLE 43. CALCULATION OF INTERCLASS CORRELATION COEFFICIENT
FOR TWIN LAMBS OF OPPOSITE SEX

         Females (x)                       Males (y)
Weight,   Deviation              Weight,   Deviation
  kg.     from mean     d_x^2      kg.     from mean     d_y^2    d_x × d_y
          of x (d_x)                       of y (d_y)

  26         -3            9       29         -2            4        +6
  33         +4           16       32         +1            1        +4
  20         -9           81       24         -7           49       +63
  28         -1            1       29         -2            4        +2
  24         -5           25       28         -3            9       +15
  33         +4           16       37         +6           36       +24
  35         +6           36       34         +3            9       +18
  32         +3            9       33         +2            4        +6
  27         -2            4       35         +4           16        -8
  32         +3            9       29         -2            4        -6

 290      -20 +20        206      310      -16 +16        136     -14 +138

Mean = 29     0                  Mean = 31     0                    +124
$$\text{Interclass correlation, } r_{xy} = \frac{\text{S.P.}}{\sqrt{\text{S.S.}x \times \text{S.S.}y}} = \frac{+124}{\sqrt{206 \times 136}} = +0.741$$

$$\text{Standard error of } r = \frac{\sqrt{1 - r^2}}{\sqrt{n - 2}} = \frac{\sqrt{1 - 0.741^2}}{\sqrt{10 - 2}} = 0.2375$$

$$t \text{ (by calculation)} = \frac{0.741}{0.2375} = 3.12$$

For 8 degrees of freedom, this value of t is significant on a
probability less than 0.02.
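The interclass calculation can be checked from the raw weights. The sketch below assumes the ten pairs of weights as read from Tables 43 and 44; the variable names are illustrative.

```python
import math

# Weights (kg) of twin lambs of opposite sex, as read from Tables 43 and 44.
ewes = [26, 33, 20, 28, 24, 33, 35, 32, 27, 32]   # x
rams = [29, 32, 24, 29, 28, 37, 34, 33, 35, 29]   # y
n = len(ewes)

mean_x = sum(ewes) / n
mean_y = sum(rams) / n

# Sums of squares and products of deviations, each taken from its own mean.
ss_x = sum((x - mean_x) ** 2 for x in ewes)
ss_y = sum((y - mean_y) ** 2 for y in rams)
sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(ewes, rams))

r = sp / math.sqrt(ss_x * ss_y)
se_r = math.sqrt((1 - r**2) / (n - 2))
t = r / se_r

print(f"S.S.x = {ss_x:.0f}, S.S.y = {ss_y:.0f}, S.P. = {sp:+.0f}")
print(f"r = {r:+.3f}, S.E. = {se_r:.4f}, t = {t:.2f} on {n - 2} d.f.")
```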

In estimating this interclass correlation, the sum of squares of x
and the sum of squares of y are calculated separately by squaring
the deviations from their respective means. Where no such
grouping is practicable, the corresponding x and y readings are
interchangeable, and in Table 44, as an indication of this,
the first entry of any pair has been designated x' and the second
x''. In estimating the intraclass correlation, as the data are
nondivisible, the sums of squares and products are based on deviations
from the general mean of the whole 20 variates, i.e., of 2n variates.

In testing the significance of an intraclass correlation coefficient,
it is necessary to transform r to z, using the expression

$$z = \frac{\log_e (1 + r) - \log_e (1 - r)}{2} + \frac{1}{2}\log_e \frac{n}{n - 1}$$

With an intraclass correlation, there is an unavoidable negative
bias in the estimation of r and a correction has to be applied by
adding to z the value of the final term in the equation, viz.,
$\frac{1}{2}\log_e \frac{n}{n - 1}$. For the above example,

$$z = \frac{\log_e 1.63 - \log_e 0.37}{2} + \frac{1}{2}\log_e \frac{10}{9} = 0.7940$$

Standard error of z as determined from an intraclass correlation*

$$= \frac{1}{\sqrt{n - \frac{3}{2}}} = \frac{1}{\sqrt{8.5}} = 0.343$$

* Contrast this with the expression used for estimating the standard error
of z for an interclass correlation, viz., $\frac{1}{\sqrt{n - 3}}$.
z is normally distributed, so that, to be significant, it must exceed
twice its standard error. In this case $r_{x'x''}$ is therefore definitely
significant, the actual probability as determined from the Table of
x being between 0.02 and 0.03.

TABLE 44. CALCULATION OF INTRACLASS CORRELATION COEFFICIENT
FOR TWIN RAM LAMBS

            x'                               x''
Weight,   Deviation               Weight,   Deviation
  kg.     from general  d_x'^2      kg.     from general  d_x''^2   d_x' × d_x''
 (x')     mean (d_x')              (x'')    mean (d_x'')

  26         -4           16        29         -1            1          +4
  33         +3            9        32         +2            4          +6
  20        -10          100        24         -6           36         +60
  28         -2            4        29         -1            1          +2
  24         -6           36        28         -2            4         +12
  33         +3            9        37         +7           49         +21
  35         +5           25        34         +4           16         +20
  32         +2            4        33         +3            9          +6
  27         -3            9        35         +5           25         -15
  32         +2            4        29         -1            1          -2

 290      -25 +15        216       310      -11 +21        146      -17 +131

                                                                      +114

$$\text{General mean} = \frac{290 + 310}{20} = 30 \text{ kg.}$$

$$\text{Total S.S.} = 216 + 146 = 362$$

$$\text{Intraclass correlation coefficient, } r_{x'x''} = \frac{\text{S.P.} \times 2}{\text{total S.S.}} = \frac{+114 \times 2}{362} = +0.63$$
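The intraclass calculation for pairs, together with the corrected z and its standard error, can be sketched as follows. The weights are assumed to be as read from Table 44, and the variable names are illustrative. Because the code carries r to full precision rather than rounding it to 0.63, z may differ from the printed value in the fourth decimal place.

```python
import math

# Weights (kg) of twin ram lambs from Table 44: (first, second) of each pair.
pairs = [(26, 29), (33, 32), (20, 24), (28, 29), (24, 28),
         (33, 37), (35, 34), (32, 33), (27, 35), (32, 29)]
n = len(pairs)

# The two members of a pair are interchangeable, so all 2n readings
# are measured from one general mean.
general_mean = sum(a + b for a, b in pairs) / (2 * n)

total_ss = sum((a - general_mean) ** 2 + (b - general_mean) ** 2 for a, b in pairs)
sp = sum((a - general_mean) * (b - general_mean) for a, b in pairs)

r = 2 * sp / total_ss

# z with the correction for negative bias, and its standard error.
z = 0.5 * (math.log(1 + r) - math.log(1 - r)) + 0.5 * math.log(n / (n - 1))
se_z = 1 / math.sqrt(n - 1.5)

print(f"general mean = {general_mean:.0f} kg, total S.S. = {total_ss:.0f}, S.P. = {sp:+.0f}")
print(f"r = {r:+.4f}, z = {z:.4f}, S.E. of z = {se_z:.3f}, ratio = {z / se_z:.2f}")
```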




Example 22. Computation of Intraclass Correlation Coefficient
for Triplet Lambs. The estimation of the intraclass correlation
need not be limited to examples in which the readings are
recorded as n similar pairs. It provides an equally valid method
of testing the correlation when the data are arranged in groups of
3, 4, 5, . . . p similar individuals, each group forming, as it were,
one family. Suppose, in the last example, that the data recorded
had been for triplets and not twins and included the weights for
the third member of each family as shown in Table 45. The
deviations are again taken from the general mean of the 3n
readings and the total sum of squares is calculated in the ordinary
way. The sum of products is obtained by adding together the
product deviations of the three members of each family taken in
all combinations two at a time.


TABLE 45. CALCULATION OF INTRACLASS CORRELATION COEFFICIENT
FOR TRIPLET LAMBS

               x'''                           Product deviations
Weight of    Deviation
third        from general   d_x'''^2   d_x' × d_x''   d_x' × d_x'''   d_x'' × d_x'''
lamb, kg.    mean
(x''')       (d_x''')

   30           0              0           +4              0               0
   34          +4             16           +6            +12              +8
   23          -7             49          +60            +70             +42
   28          -2              4           +2             +4              +2
   26          -4             16          +12            +24              +8
   35          +5             25          +21            +15             +35
   36          +6             36          +20            +30             +24
   30           0              0           +6              0               0
   32          +2              4          -15             -6             +10
   26          -4             16           -2             -8              +4

  300       -17 +17          166        -17 +131       -14 +155          0 +133

                                          +114           +141            +133

$$\text{General mean} = \frac{290 + 310 + 300}{3n} = \frac{900}{30} = 30 \text{ kg.}$$

$$\text{Total S.S.} = 216 + 146 + 166 = 528$$

$$\text{S.P.} = +114 + 141 + 133 = +388$$


If p represents the number of members in any family, the
intraclass correlation,

$$r = \frac{\text{S.P. for } \frac{p(p - 1)}{2} \text{ series of product deviations}}{\text{total S.S.} \times \frac{p - 1}{2}}$$

In this example where p = 3,

$$r = \frac{\text{S.P. from three series of product deviations}}{\text{total S.S.}} = \frac{+388}{528} = +0.7349$$

Where the number in each family (p) exceeds 2, the best
estimate of z is obtained from

$$z = \frac{1}{2}\log_e \frac{1 + (p - 1)r}{1 - r} + \frac{1}{2}\log_e \frac{n}{n - 1}$$

When p = 2, i.e., when r is a measure of the intraclass correlation
for pairs of similar individuals, this formula reduces to that
already given in connection with the data for twin lambs. The
same correction for the negative bias in the estimation of r is
required. For the above example,

$$z = \frac{1}{2}\log_e \frac{1 + (3 - 1)0.7349}{1 - 0.7349} + \frac{1}{2}\log_e \frac{10}{9} = 1.1685$$

The standard error of z is, approximately,*

$$\sqrt{\frac{p}{2(p - 1)(n - 2)}} = \sqrt{\frac{3}{2 \times 2 \times 8}} = 0.306$$

The estimated value of z is much greater than twice its standard
error, proving that the correlation of 0.7349 is definitely
significant.
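For families of any size p, the same quantities can be computed in a few lines. This sketch assumes the triplet weights as read from Tables 44 and 45 and forms the product deviations for every pair of members within a family; names are illustrative.

```python
import math
from itertools import combinations

# Weights (kg) of triplet lambs: the pairs of Table 44 plus the third
# member of each family from Table 45.
families = [(26, 29, 30), (33, 32, 34), (20, 24, 23), (28, 29, 28), (24, 28, 26),
            (33, 37, 35), (35, 34, 36), (32, 33, 30), (27, 35, 32), (32, 29, 26)]
n = len(families)        # number of families
p = len(families[0])     # members per family

general_mean = sum(sum(f) for f in families) / (n * p)
dev = [[w - general_mean for w in f] for f in families]

total_ss = sum(d * d for f in dev for d in f)
# Products over every pair of members within each family, i.e. over
# p(p - 1)/2 series of product deviations.
sp = sum(a * b for f in dev for a, b in combinations(f, 2))

r = sp / (total_ss * (p - 1) / 2)
z = 0.5 * math.log((1 + (p - 1) * r) / (1 - r)) + 0.5 * math.log(n / (n - 1))
se_z = math.sqrt(p / (2 * (p - 1) * (n - 2)))

print(f"total S.S. = {total_ss:.0f}, S.P. = {sp:+.0f}, r = {r:+.4f}")
print(f"z = {z:.4f}, approximate S.E. = {se_z:.3f}")
```

Setting p = 2 in the same expressions gives the formulas used for the twin lambs.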

Another method of arriving at exactly the same result is to
carry out an analysis of variance of the data. The total sum of
squares for the 3n variates has already been calculated, viz., 528
with 29 degrees of freedom. This total sum of squares can be
validly split up into its two components, the sum of squares
between families and the error sum of squares, i.e., the sum of
squares within families of three similar individuals. Either of
them can be calculated in the usual way, and by subtraction from
the total sum of squares, the second component can be assessed.

* When n is small, this expression does not accurately evaluate the vari-


TABLE 46. ANALYSIS OF VARIANCE OF DATA FOR TRIPLET RAMS

Factor                            S.S.     Degrees of    Variance    1/2 log_e of       z
                                            freedom                    variance

Total                            528.00       29
Between families                 434.67        9          48.30        1.9387  }
                                                                                   1.1685
Within families, i.e., error      93.33       20           4.667       0.7702  }


This value of z is significant on a probability less than 0.01.
The two estimations of z, from the intraclass correlation and
from the analysis of variance, are identical. Thus the z test as
used in any analysis of variance is essentially a test to find out if
the data show any significant correlation between similar
individuals, i.e., between members of the same family. If positive
correlation exists, the readings for any one family will tend to be
similar and, in consequence, the variance within families will be
less than that between families; the z test proves whether this
difference in variance, in other words, the correlation, is large
enough to be considered significant or not. If no correlation is
present, the variance within families will be of the same order as
that between families. On the other hand, in the case of a
negative correlation, a high reading of x' will on the average be
associated in the same family with a low reading of x'', and the
variance within families will tend to be greater than that between
families. The z test can again be used to test whether this
difference in variance is significant, i.e., whether the negative
correlation is significant.
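The identity of the two estimates of z can be verified numerically. The sketch below assumes the triplet weights as read from Tables 44 and 45, computes z once from the analysis of variance and once from the intraclass correlation, and compares them; names are illustrative.

```python
import math
from itertools import combinations

families = [(26, 29, 30), (33, 32, 34), (20, 24, 23), (28, 29, 28), (24, 28, 26),
            (33, 37, 35), (35, 34, 36), (32, 33, 30), (27, 35, 32), (32, 29, 26)]
n, p = len(families), len(families[0])
total_count = n * p

grand_total = sum(sum(f) for f in families)
correction = grand_total ** 2 / total_count

# Analysis of variance: total, between-family and within-family sums of squares.
total_ss = sum(w * w for f in families for w in f) - correction
between_ss = sum(sum(f) ** 2 for f in families) / p - correction
within_ss = total_ss - between_ss

var_between = between_ss / (n - 1)
var_within = within_ss / (total_count - n)
z_anova = 0.5 * (math.log(var_between) - math.log(var_within))

# Intraclass route: r from the product deviations, then the corrected z.
mean = grand_total / total_count
sp = sum((a - mean) * (b - mean) for f in families for a, b in combinations(f, 2))
r = sp / (total_ss * (p - 1) / 2)
z_intraclass = (0.5 * math.log((1 + (p - 1) * r) / (1 - r))
                + 0.5 * math.log(n / (n - 1)))

print(f"between S.S. = {between_ss:.2f}, within S.S. = {within_ss:.2f}")
print(f"z (analysis of variance) = {z_anova:.4f}, z (intraclass) = {z_intraclass:.4f}")
```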

Unless an estimate of the actual correlation coefficient is 
required, the analysis of variance is not only the more accurate 
method of statistical interpretation but is also easier to evaluate, 
especially for high values of p. 


CHAPTER VI 


REGRESSION 

The regression concept is closely allied to that of correlation in
that it is concerned with the way in which changes in one character
or variable are reflected in, or dependent upon, simultaneous
changes occurring in some other associated variable or variables.
The regression function is, however, of wider application than the
correlation coefficient and, particularly in biological research, can
often be used effectively in problems in which the latter statistic
would have little significance. In many correlation problems,
the reaction between the associated variables is not mutual in that
one factor is the causative agency which produces by any change
in value some measurable response in the second factor, the
converse being an apparent absurdity. For example, rainfall
and yield are often correlated, and this correlation is obviously
the result of the influence of the rainfall on the yield and cannot
be due to that of yield on rainfall. Yield is then termed the
dependent and rainfall the independent factor. In general, if x
is the dependent and y the independent factor, the recorded
values of x, for any one value of y, will be certain to show the
ordinary variation occurring in any random sample taken from
that particular population. In other words, the recorded values
of x for each value of y will tend to cover a range of readings,
say $\pm 3\sigma$ from their mean. The regression function is the one
which expresses the average value that may be expected from the
variates in one factor for any given value of the correlated factor.

If the data were sufficiently extensive, it might be possible to
estimate the mean value of x, the dependent variable, for each
value of y and to use these means to plot a graph of x against y.
With adequate data, this graph will be in the nature of a continuous
curve showing how x responds to measured changes in y or,
expressed more technically, the regression of x on y. The simplest
form that the curve can take is a straight line, the line of linear
regression. This line is accurately defined by the regression
equation which may be expressed as

$$X = M_x + b_{xy}(y - M_y)$$

where X = the average value of the x variates that may be
expected when the value of the y variable is fixed
at y,
$M_x$ and $M_y$ = the means of the x and y variables, respectively,
$b_{xy}$ = the regression coefficient of x on y, i.e., the number
of units the x variable will change, on the average,
for a unit change in the y variable.

When the reaction between the correlated variables is apparently
mutual, i.e., when they cannot be effectively allocated to the
dependent and independent classes, the regression of y on x may
also be validly computed and may be of considerable statistical
significance. This second regression will usually give different
values from that of x on y, the equation, in a linear regression,
becoming

$$Y = M_y + b_{yx}(x - M_x)$$

where Y = the average value of y for the given value of x,
$b_{yx}$ = the regression coefficient of y on x.

ESTIMATION OF COEFFICIENT OF REGRESSION

Example 23. In experimental work, the data are seldom complete
enough to fix the regression graphs exactly, but they will
often suffice to fix a curve which will approximate sufficiently
closely to the true one to indicate the general trend of the results.

Table 47 is a correlation table for 100 oat plants in which the
number of culms per plant has been recorded against the
corresponding yield of grain in grams and the value of the interclass
correlation coefficient has been determined. In this example, it is
presumed that the dependent factor is the yield x as influenced
by the independent factor y, i.e., by the number of culms per
plant. In the last three columns of the first half of the table, the
average yield of all the plants in each of the six culm classes (2 to
7) has been computed and these figures used to plot Fig. 7A. The
plotted points determine, for the recorded data, the location of
the regression graph of yield x on number of culms per plant y.
There are some obvious irregularities, but the points apparently
tend to be located along a straight line, the line of linear regression,
which runs in the median position between the plotted
points.

Table 47.* Correlation Table for Yield of Grain and Number of Culms
in Oats (Continued)

Deviation from assumed mean                                               Total
yield of 4 gm. (d_x)            -3    -2    -1     0    +1    +2    +3    +4

Computation of S.S. x:
  f_x \times d_x                -3   -24   -25     0   +21   +22    +3    +4     -2
  f_x \times d_x^2               9    48    25     0    21    44     9    16    172

Product deviations:
  Individual frequencies
  \times deviation from
  assumed mean of y,
  \Sigma(f \times d_y)          -2   -18   -25   -10    +2    +7    +1    +3    -42
  Product deviation,
  d_x \times \Sigma(f \times d_y)
                                +6   +36   +25     0    +2   +14    +3   +12    +98

Data for regression graph y on x:
  No. of plants in each
  yield class, i.e., f_x         1    12    25    28    21    11     1     1    100
  Total no. of culms in each
  yield class, \Sigma(f \times y)
                                 2    30    75   102    86    51     5     7    358
  Average no. of culms per
  plant for each yield class,
  \Sigma(f \times y) / f_x     2.0   2.5   3.0  3.64   4.1  4.64   5.0   7.0

* After Love and Leighty.

    Mean yield, M_x = 4 - 2/100 = 3.98
    S.S. x = 172 - (-2)^2/100 = 171.96
    Mean culm no., M_y = 4 - 42/100 = 3.58
    S.S. y = 114 - (-42)^2/100 = 96.36
    S.P. xy = +98 - (-2)(-42)/100 = 97.16
    Correlation coefficient r = +97.16 / \sqrt{171.96 \times 96.36}

Fig. 7A. Regression graph of yield of grain on number of culms in oats.

If the regression can safely be assumed to be linear, it
is possible to calculate a statistic from the original records by
means of which the line of best fit to the plotted points can be
accurately determined within the limits of the prescribed data.
The line of best fit is the one which conforms to the principle of
least squares, which stipulates that the sum of squares of the
deviations of the plotted points from the line must be at a minimum.
The statistic required to fix this line is the coefficient of
regression b or, more precisely, where x is the dependent and y
the independent variate, the coefficient of regression of x on y,
viz.,


Regression coefficient of yield on no. of culms,

    b_{xy} = S.P. xy / S.S. y = +97.16 / 96.36 = +1.01

Regression coefficient of no. of culms on yield,

    b_{yx} = S.P. xy / S.S. x = +97.16 / 171.96 = +0.57
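The two coefficients can be checked with a few lines of Python. This is only an illustrative sketch: the sums of squares and products are those quoted for Table 47, while the function and variable names are invented here and form no part of the original method.

```python
# Sums of squares and products quoted for Table 47 (100 oat plants).
SS_X = 171.96   # sum of squares of yield, x
SS_Y = 96.36    # sum of squares of culm number, y
SP_XY = 97.16   # sum of products of x and y


def regression_coefficient(sp_xy: float, ss_independent: float) -> float:
    """Slope of the dependent on the independent variable: S.P. / S.S."""
    return sp_xy / ss_independent


# Yield on culm number (y independent) and culm number on yield (x independent).
b_xy = regression_coefficient(SP_XY, SS_Y)
b_yx = regression_coefficient(SP_XY, SS_X)

print(f"b_xy = {b_xy:+.2f}")
print(f"b_yx = {b_yx:+.2f}")
```

The same sum of products serves for both coefficients; only the divisor changes with the choice of independent variable.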



The regression coefficient of yield on number of culms is

    b_{xy} = +1.01

This indicates that a deviation of +1 from the mean number of
culms is equivalent, on the average, to a deviation of +1.01 grams
from the mean yield; or, expressed in the form of an equation,

    d_x = 1.01 \times d_y

where d_y represents any given deviation from the mean culm
number and d_x the corresponding deviation from the mean yield
that might be expected on the average of a large number of
readings.

By entering the vertical and horizontal axes M_x and M_y intersecting
in the point fixed by the coordinates of the means of x and
y, it is possible to use this equation to locate accurately the line
of best fit to the plotted points, i.e., the regression line of x on y.
In fixing this line, the coordinates of the points corresponding to
deviations of +3 and -3 from the mean culm number have been
worked out from the equation, making the corresponding average
deviations from the mean yield

    d_x = 1.01 \times (\pm 3) = \pm 3.03

These are the coordinates of the points A and B in the diagram.
Therefore, within the limits of the recorded data, the straight
line AB represents the linear regression of yield on culm number.
It can therefore be used to determine what the average yield
of grain is likely to be for any fixed number of culms per plant.
It is probably better to work in the absolute units in which the
variates are measured instead of in deviations from the means.
Since d_x = X - M_x and d_y = y - M_y, and

    d_x = b_{xy} \times d_y

then, by substitution, using the notation given earlier in this
chapter,

    (X - M_x) = b_{xy}(y - M_y)

and

    X = M_x + b_{xy}(y - M_y)

Thus, the general equation already given for the regression function
is again derived. For the calculated values for the regression
of yield on number of culms (Table 47),

    X = 3.98 + 1.01(y - 3.58)
      = 1.01y + 0.36

Thus, if the number of culms y is known to be six, the average
yield that may be expected is

    1.01 \times 6 + 0.36 = 6.42 gm.

These values represent the coordinates of the point C on the
regression graph (Fig. 7A).
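The regression equation in absolute units lends itself to a small helper. The sketch below is a hedged illustration using the rounded constants given in the text; the function name `expected_yield` is hypothetical.

```python
# Constants quoted in the text for the oat data of Table 47.
M_X = 3.98   # mean yield, gm.
M_Y = 3.58   # mean number of culms per plant
B_XY = 1.01  # regression coefficient of yield on culm number (rounded)


def expected_yield(culms: float) -> float:
    """Average yield expected for a fixed culm number: X = M_x + b_xy (y - M_y)."""
    return M_X + B_XY * (culms - M_Y)


# The same line written as slope and intercept.
intercept = M_X - B_XY * M_Y
yield_at_six = expected_yield(6)

print(f"X = {B_XY}y + {intercept:.2f}")
print(f"Expected yield for 6 culms: {yield_at_six:.2f} gm.")
```

Predictions of this kind are only trustworthy within the range of culm numbers actually recorded (2 to 7).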

Mathematically, the same data may be used to calculate the
regression of culm number on yield, i.e., of y on x. This has
actually been done, and the regression is again apparently linear
(Fig. 7B) with a regression coefficient b_{yx} = +0.57. The equations
for this second regression function are

    d_y = b_{yx} \times d_x

or, in absolute values of the variates,

    Y = M_y + b_{yx}(x - M_x)

where Y represents the average number of culms for any fixed
yield. Theoretically, for any given yield x, the average culm
number should be

    3.58 + 0.57(x - 3.98) = 0.57x + 1.31

It is obvious from the nature of the data that the yield of grain
cannot determine in any way the number of culms developed by
the plant, and therefore these mathematical expressions have no
real meaning when applied to this particular problem. This
effectively illustrates the futility of applying statistical formulas
more or less indiscriminately to any data. Some basic knowledge
of the fundamental character of the various attributes under
examination is essential to an accurate interpretation of results.
In the application of the regression theory, it is important to
distinguish between the dependent and independent variates or
to know to what extent they are mutually responsive.

Fig. 7B. Regression graphs: yield of grain on number of culms in oats
and number of culms on yield of grain in oats. (Horizontal axis:
yield in grams.)

The various facts discussed in Example 23 illustrate some
general truths applicable to linear regressions. The coefficient of
regression is the tangent of the angle that the regression line
makes with the appropriate x or y axis of the graph, depending on
whether the x or the y variable is the independent factor. For
any two complementary regression lines, the line with the smaller
inclination to the x axis has x as the independent variable, and
b_{yx} is the tangent of the angle that this line makes with the x axis.
Similarly, the line with the smaller inclination to the y axis has y
as the independent variable, and b_{xy} represents the slope of this
line to the y axis. Thus in Fig. 7B,

    b_{yx} = \tan \alpha
    b_{xy} = \tan \beta

For any one pair of variables, at least one and possibly both
coefficients of regression will be less than unity. They will have
the same sign, both positive or both negative, the sign being the
same as that of the covariance. If the graph on which the regression
lines are plotted is divided into four quadrants by axes
intersecting at a point whose coordinates are the means of the
two variables, the regression lines in a positive correlation will be
located in quadrants I and III and in a negative correlation in
quadrants II and IV. The more closely the regression lines
approach one another, i.e., the more acute the angle between
them, the closer is the correlation between the variables, until, at
a correlation coefficient of \pm 1, the two lines coincide. In contrast
to this, when r is in the region of zero, the regression lines
intersect at approximately 90 degrees. They will always cross
at the intersection of the axes through the means M_x and M_y. In
Fig. 7B the angle between the regression lines is acute, indicating
fairly high correlation; the graphs lie in the first and third
quadrants, and the correlation should be positive, its actual value 
by calculation being +0.755.
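The link between the two regression coefficients and the correlation coefficient can be verified numerically: their product equals r squared, which is why at least one of them cannot exceed unity in magnitude. The following is a minimal sketch on the Table 47 sums; the variable names are assumptions made for the illustration.

```python
import math

# Sums of squares and products quoted for Table 47.
SS_X, SS_Y, SP_XY = 171.96, 96.36, 97.16

b_xy = SP_XY / SS_Y  # regression of yield on culm number
b_yx = SP_XY / SS_X  # regression of culm number on yield

# Correlation coefficient computed directly from the sums.
r = SP_XY / math.sqrt(SS_X * SS_Y)

# The same value recovered from the two slopes; the sign follows
# that of the sum of products (i.e., of the covariance).
r_from_slopes = math.copysign(math.sqrt(b_xy * b_yx), SP_XY)

print(f"r = {r:+.3f}")
print(f"b_xy * b_yx = {b_xy * b_yx:.4f}, r^2 = {r * r:.4f}")
```

Because the product of the slopes is r squared, the two lines coincide only when r is +1 or -1, in agreement with the geometric statement above.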

SIGNIFICANCE OF REGRESSION FUNCTION

In addition to defining the relationship between two variables,
the application of the regression function to certain types of
research data will often amplify the resultant conclusions by
demonstrating any progressive change occurring in the data or by
producing a valid reduction of the error variance. For example,
in experiments with crops that are repeatedly ratooned, such as
semiperennial pasture or fodder crops, there will often be a
tendency for the yields from successive harvests to show a
gradual decline. This may be a result of the senescence factor in
the plant or of a gradual reduction in soil fertility or of both.
The significance of any such general trend in the variates can be
effectively assessed by means of the regression function.

Example 24. Use of Regression Function in Interpretation of
Results. Table 48 records the yields obtained from an experiment
with a fodder crop of guinea grass in which the grass was
harvested once per month over an 8-month period. Do these
figures indicate any significant drop in yield from the first to the
last crop? In this experiment, it is the time factor or age of the
ratoon that is thought to be affecting the yields, and the regression
of yield, x, on age, y, provides an effective test of any significant
downward trend in the yield data.
degrees of freedom may be validly allocated to the coefficient of
regression.

    Standard error of b_{xy} = \sqrt{ \Sigma(x - X)^2 / [(n - 2) \times S.S. y] }

In applying this expression to the data for the guinea grass, the
first step is to work out X for each value of y in order to estimate
\Sigma(x - X)^2.

From the regression equation,

    X = 45 - 9.76(y - 4.5)

and therefore

    X = 88.93 - 9.76y

From this equation Table 49 has been compiled.

Table 49. Calculation of \Sigma(x - X)^2 for Data of Table 48

A quicker method of arriving at the same result is to substitute
the appropriate values in the identity,

    \Sigma(x - X)^2 = S.S. x - b_{xy}^2 \times S.S. y

Therefore,

    \Sigma(x - X)^2 = 4,788 - (-9.762)^2 \times 42
                    = 4,788 - 4,002.4 = 785.6 (as calculated above)

It is important to note that, in using this short method, b is not
only squared but multiplied by the sum of squares of y,
which may be a relatively large number. Therefore, to ensure
accuracy, the value of b must be taken to several places of
decimals. As an alternative, the equation can be expressed in
another form as follows: b_{xy} is assessed from S.P. xy / S.S. y, so that
S.S. x - b_{xy}^2 \times S.S. y reduces to

    S.S. x - (S.P. xy)^2 / S.S. y

and when the necessary sums of squares are available, this last
is the simplest equation to use in estimating \Sigma(x - X)^2.

    Standard error of b_{xy} = \sqrt{ 785.6 / (6 \times 42) } = 1.766

The significance of b_{xy} can now be determined by calculating t.

    t = b_{xy} / (standard error of b_{xy}) = 9.762 / 1.766 = 5.53

Reference to the Table of t opposite n = 6 (the degrees of freedom
of the regression function) shows that this value of t corresponds
to a probability less than 0.01. b_{xy} is therefore highly significant,
proving that the later ratoons show a definite falling off in yield.
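The test of significance for the guinea grass regression can be retraced in code. The sketch below assumes only the sums quoted in this chapter for Table 48 (S.S. x = 4,788, S.S. y = 42, S.P. xy = -410, eight monthly crops); the variable names are illustrative.

```python
import math

# Sums quoted for the guinea grass data of Table 48.
SS_X = 4788.0    # sum of squares of yield
SS_Y = 42.0      # sum of squares of age
SP_XY = -410.0   # sum of products
N = 8            # number of monthly crops

# Regression coefficient of yield on age.
b_xy = SP_XY / SS_Y

# Sum of squares of deviations from regression, by the short identity
# S.S. x - (S.P. xy)^2 / S.S. y, carrying n - 2 degrees of freedom.
residual_ss = SS_X - SP_XY ** 2 / SS_Y
residual_df = N - 2

# Standard error of the coefficient and the resulting t.
se_b = math.sqrt(residual_ss / (residual_df * SS_Y))
t_value = abs(b_xy) / se_b

print(f"b_xy = {b_xy:.3f}")
print(f"deviations from regression = {residual_ss:.1f} ({residual_df} d.f.)")
print(f"standard error = {se_b:.3f}, t = {t_value:.2f}")
```

Using the sum of products directly, rather than the rounded coefficient, avoids the loss of accuracy that the text warns about when b is squared.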

Short Method of Computing Sum of Squares When Variates
Are in Arithmetical Progression. In Table 48, the variates of the
independent factor y form a regular sequence of numbers in
arithmetical progression. This is not an uncommon feature of
research data from which correlation or regression coefficients are
evaluated, and the following simple method of calculating the
sum of squares is worth noting as it effectively reduces the amount
of routine arithmetic involved:

For any variable y whose n variates are arranged in a regular
sequence at equal intervals of i units,

    S.S. y = n(n^2 - 1) i^2 / 12

Thus for the data of Table 48,

    S.S. y = 8(8^2 - 1) \times 1^2 / 12 = 42 (as originally calculated)
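The short method is easily checked against the ordinary computation. The sketch below assumes the standard identity S.S. = n(n^2 - 1)i^2/12 for n equally spaced values at interval i; both function names are hypothetical.

```python
def ss_arithmetic_progression(n: int, interval: float = 1.0) -> float:
    """Sum of squares about the mean for n equally spaced variates."""
    return n * (n ** 2 - 1) * interval ** 2 / 12


def ss_direct(values) -> float:
    """Ordinary sum of squared deviations from the mean."""
    values = list(values)
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


# Ages 1 to 8 months (guinea grass) and 1 to 10 months (elephant grass).
ss_guinea_age = ss_arithmetic_progression(8)
ss_elephant_age = ss_arithmetic_progression(10)

print(ss_guinea_age, ss_direct(range(1, 9)))
print(ss_elephant_age, ss_direct(range(1, 11)))
```

The shortcut applies only when the intervals are strictly equal; a single missing harvest would call for the direct computation.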



COMPARISON OF INDEPENDENT ESTIMATES OF COEFFICIENT
OF REGRESSION

Example 25. — The experiment from which the guinea grass 
data (Table 48) were extracted also included yields from 10 crops 
of elephant grass, as recorded in Table 50. 

Table 50. Yields of Ten Ratoon Crops of Elephant Grass
(Columns: age of crop, months; monthly yields of elephant grass,
kg. per plot.)

    S.S. x = 14,166 - 350^2/10 = 1,916

Alternatively, by the short method,

    S.S. y = 10(10^2 - 1)/12 = 82.5

    b_{xy} = -343/82.5 = -4.158

    \Sigma(x - X)^2 = 1,916 - 343^2/82.5 = 490

    Standard error of b_{xy} = \sqrt{ 490 / (8 \times 82.5) } = 0.861

    t (by calculation) = 4.158 / 0.861 = 4.83
The probability of exceeding this value of t purely by chance is
less than 0.01, as determined from the Table of t for n = 8. This
proves that with the elephant grass also there is a progressive
decline in yield with successive crops from the same stools.

The value of the coefficient of regression for the guinea grass
yields is more than double that of the elephant grass. For the
former variety the yields range from 85 to 15 kilograms, while for
the elephant grass the range is only 55 to 18 kilograms. It might
be of advantage to ascertain whether or not these data indicate
that the rate at which the yields are declining is greater in the case
of the guinea grass. To test this, it is necessary to determine
whether the difference between the respective regression coefficients
is significant or not. The coefficients of regression which it
is desired to compare have been estimated from two distinct
series of readings, one series for guinea grass and the second for
elephant grass. In a simple analysis of variance applied to the
yield data x of the fodder grass experiment, the within-series or
error variance would be evaluated from the aggregate of the sums
of squares computed from each series independently. For the
yield data alone,

    Error S.S. = 4,788 + 1,916, with 7 + 9 degrees of freedom
               = 6,704 with 16 degrees of freedom
Similarly, the whole of the recorded data should be used in
calculating the sum of (x - X)^2 from which the standard errors
of the estimated coefficients of regression will ultimately be computed.
Therefore, for this experiment,

    \Sigma(x - X)^2 = 785.6 (guinea grass) + 490.0 (elephant grass),
                      with 6 + 8 degrees of freedom
                    = 1,275.6 with 14 degrees of freedom

The standard error of any coefficient of regression b_{xy} is evaluated
from the expression

    \sqrt{ \Sigma(x - X)^2 / [(n - 2) \times S.S. y] }

The values of \Sigma(x - X)^2 and of n - 2 to substitute in this formula
are the aggregate ones obtained from the whole of the available data, as
calculated above. These aggregate values of \Sigma(x - X)^2 and of
n - 2 may be validly used in calculating the standard error for
any of the separate coefficients of regression computed from the
data. In this experiment, there are two estimates of the coefficient
of regression, viz.,

    (a) b_{xy} for guinea grass   = -9.762
    (b) b_{xy} for elephant grass = -4.158
        Difference, D             =  5.604

It is desired to ascertain whether this difference between the two
regression coefficients may be regarded as significant or not. By
substituting the appropriate numerical values in the expression
for the standard error of a regression coefficient,

    (a) Standard error of b_{xy} = \sqrt{ 1,275.6 / (14 \times 42) }
                                 = \sqrt{ 91.11 / 42 }

    (b) Standard error of b_{xy} = \sqrt{ 1,275.6 / (14 \times 82.5) }
                                 = \sqrt{ 91.11 / 82.5 }

From first principles, the standard error of the difference is the
root of the sum of the squares of the individual standard errors,
and therefore

    Standard error of D = \sqrt{ 91.11/42 + 91.11/82.5 } = 1.809

    t (by calculation) = D / (standard error of D) = 5.604 / 1.809 = 3.097


The available number of degrees of freedom of \Sigma(x - X)^2 from
which t was calculated is 14. The nearest reading from the Table
of t at this level of n is 2.977 for P = 0.01, proving that the difference
between the regression coefficients is highly significant.
This shows that, with successive ratoon crops, the yield of the
guinea grass is falling away more rapidly than that of the elephant
grass.
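The comparison of the two coefficients can be laid out step by step in code. This is an illustrative sketch built on the figures quoted in Examples 24 and 25; the names are invented for the purpose, and the critical value 2.977 is the Table of t reading cited in the text for 14 degrees of freedom at P = 0.01.

```python
import math

# Deviations from regression and their degrees of freedom, per grass.
residual_ss_guinea, df_guinea = 785.6, 6
residual_ss_elephant, df_elephant = 490.0, 8

# Sums of squares of the independent factor (age) for each series.
ss_y_guinea, ss_y_elephant = 42.0, 82.5

# Regression coefficients of yield on age.
b_guinea, b_elephant = -9.762, -4.158

# Pooled variance of deviations from regression (14 degrees of freedom).
pooled_df = df_guinea + df_elephant
pooled_variance = (residual_ss_guinea + residual_ss_elephant) / pooled_df

# Standard error of each coefficient, then of their difference.
se_guinea = math.sqrt(pooled_variance / ss_y_guinea)
se_elephant = math.sqrt(pooled_variance / ss_y_elephant)
se_difference = math.sqrt(se_guinea ** 2 + se_elephant ** 2)

difference = abs(b_guinea - b_elephant)
t_value = difference / se_difference

T_CRITICAL_1_PERCENT = 2.977  # Table of t, 14 d.f., P = 0.01 (as quoted)
print(f"pooled variance = {pooled_variance:.2f}")
print(f"S.E. of difference = {se_difference:.3f}, t = {t_value:.3f}")
print("significant at 1 per cent:", t_value > T_CRITICAL_1_PERCENT)
```

Pooling the deviations from both series gives the standard errors more degrees of freedom than either series could supply alone.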

LINEAR REGRESSION COMPONENT OF VARIATION 

When one variable x shows some measurable response to
changes in a second variable y, the dispersion of the x variates
must represent the combined effect of the variation induced by
the independent factor y and the ordinary errors of random
sampling occurring in the dependent factor x. These two components
of the total sum of squares of x represent, respectively,
the regression of x on y and the deviations from this regression,
both of which may be accurately computed. The second
component, i.e., the sum of the squares of the deviations from the
regression line, is obtained by using b_{xy} to evaluate \Sigma(x - X)^2,
where X is the expected value of x, as determined from the
regression equation, for each recorded value of y. It has already
been shown that

    \Sigma(x - X)^2 = S.S. x - (S.P. xy)^2 / S.S. y

The number of degrees of freedom of this component of the total
S.S. x will be n - 2, where n is the number of variates from which
the S.S. x was computed. The first component, the regression
of x on y, must account for the balance of the total S.S. and of
the total available degrees of freedom n - 1.

Therefore

    S.S. linear regression = S.S. x - [S.S. x - (S.P. xy)^2 / S.S. y]
                           = (S.P. xy)^2 / S.S. y
                           = (S.P.)^2 / (S.S. of independent factor)

with (n - 1) - (n - 2) = 1 degree of freedom.

The following data for the yield x as recorded against the age y
of a crop of guinea grass have been extracted from Table 48 to
exemplify the practical application of this technique.

    S.S. x = 4,788 with 7 degrees of freedom
    S.S. y = 42 with 7 degrees of freedom
    S.P. xy = -410

In this example, the yield x is the dependent factor; hence the
sum of squares of x represents the aggregate of the sums of
squares attributable to the linear regression component and the
deviations from this regression. The full analysis of the sum of
squares of x is appended:

Factor                        S.S.                        Degrees of   Variance     F
                                                          freedom
Linear regression             (-410)^2/42 = 4,002.4           1        4,002.4    30.59
Deviations from regression    4,788 - 4,002.4 = 785.6         6          130.9
The sum of squares of the deviations from regression, 785.6, is
identical with \Sigma(x - X)^2 as already calculated in estimating the
standard error of b_{xy} for the guinea grass data. In fact, the above
analysis provides an alternative method of testing the significance
of b_{xy}, the coefficient of regression of yield on age. If b_{xy} is
significant, the linear regression variance will be much larger than the
unavoidable errors of random sampling as measured by the deviations
from this regression. The F or the z test may be validly
used to determine whether the two variances are significantly
different, i.e., whether b_{xy} is significant. Here, F = 30.59 as
compared with a reading of 13.74 from the Table of F for n_1 = 1,
n_2 = 6, and P = 0.01. b_{xy} is therefore significant on a probability
much less than 0.01, precisely the same conclusion as was
originally obtained by calculating t from the standard error of
b_{xy}. Both tests are bound to give exactly the same result, and
in any particular example, the easier one to evaluate should
be used in preference.
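That the two tests must agree follows from the identity F = t^2 when the regression carries a single degree of freedom. A brief sketch on the guinea grass sums, with illustrative names, makes the point; the result differs from the quoted 30.59 only in the last figure, presumably through rounding in the hand calculation.

```python
import math

# Sums quoted for the guinea grass data of Table 48.
SS_X, SS_Y, SP_XY, N = 4788.0, 42.0, -410.0, 8

# Partition of S.S. x into regression (1 d.f.) and deviations (n - 2 d.f.).
regression_ss = SP_XY ** 2 / SS_Y
deviation_ss = SS_X - regression_ss
deviation_variance = deviation_ss / (N - 2)

# Variance ratio: the regression variance equals its S.S. (1 d.f.).
f_ratio = regression_ss / deviation_variance

# The t test on the coefficient itself, for comparison.
b_xy = SP_XY / SS_Y
se_b = math.sqrt(deviation_variance / SS_Y)
t_value = abs(b_xy) / se_b

print(f"regression S.S. = {regression_ss:.1f}")
print(f"deviation S.S. = {deviation_ss:.1f}, variance = {deviation_variance:.1f}")
print(f"F = {f_ratio:.2f}, t^2 = {t_value ** 2:.2f}")
```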

REDUCTION OF ERROR VARIANCE BY MEANS OF REGRESSION

In agricultural research, complete control of all the external
factors likely to have an influence on the recorded data is not
generally possible. When the simultaneous variation occurring
in any such external agency can be effectively computed, the
linear regression component of the error variance of the dependent
factor may be regarded as a fair measure of the influence of the
independent factor on the estimate of error. The variance of the
deviations from this regression may then be validly used to determine
the significance of differences between treatment means.
The data for the fodder crop experiment with guinea and elephant
grass (Tables 48, 50) effectively illustrate the advantages of this
technique in practice. Consider first the ordinary analysis of
variance of the yield data alone, x, ignoring for the present the
age factor.

    Total S.S. = 20,988 + 14,166 - 710^2/18 = 7,148.5,
                 with 17 degrees of freedom

    Variety S.S. = 360^2/8 + 350^2/10 - 710^2/18 = 444.5,
                   with 1 degree of freedom

This enables Table 51 to be compiled.


Table 51. Analysis of Variance of Yield Data for Fodder Grass
Experiment (Tables 48, 50)

Factor                          S.S.       Degrees of    Variance     F
                                           freedom
Total                          7,148.5        17
Variety                          444.5         1          444.5      1.06
Within variety, i.e., error    6,704.0        16          419.0

The variety variance is obviously not significantly greater
than the error variance, which would indicate that there is no
significant difference between the mean yields of the guinea grass
and the elephant grass. The respective mean values are 45 and
35 kilograms per plot, and the difference between the variety
means is therefore approximately 25 per cent of the general mean.
It is at first surprising that a mean difference of this magnitude
is not significant, but closer inspection of the data shows that
the reason is the rapid falling off in the yields with the increasing
age of the crop. This in turn is responsible for excessive
dispersion of the variates resulting in an unduly large estimate
of error and leading to the nonsignificant result quoted above.
It is possible to discount the effect of age on the yield data by
extracting the linear regression component from the error
variance, so as to leave a reduced estimate of error, equivalent
to \Sigma(x - X)^2, i.e., the sum of the squares of the deviations
from regression. In this example, \Sigma(x - X)^2 as already evaluated
(Example 25) is 1,275.6 with 14 degrees of freedom. The
reduced error variance is therefore 1,275.6/14 = 91.11. The
standard error of the difference between the mean yields of the
guinea grass and the elephant grass now becomes

    \sqrt{ 91.11/8 + 91.11/10 } = 4.528

The mean difference is 10 kilograms so that t by calculation is
10/4.528 = 2.208. Reference to the Table of t at the available 14
degrees of freedom of the reduced error variance shows that this
value of t corresponds to a probability between 0.05 and 0.02,
proving that the guinea grass has given a significantly higher yield
than the elephant grass. Thus, the elimination of the influence
of age on the yield data has been effective in reducing considerably
the estimate of error. This can be expected only when the regression
is significant, i.e., when the linear regression component of
the variance is definitely larger than the deviations from
regression.
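The effect of removing the regression component from the error can be seen by running the two tests side by side. The sketch below uses only figures quoted above (error S.S. 6,704 on 16 degrees of freedom, variety S.S. 444.5, reduced S.S. 1,275.6 on 14 degrees of freedom, means of 45 and 35 kg. from 8 and 10 crops); the variable names are assumptions.

```python
import math

# Ordinary analysis of variance of yield alone (Table 51).
variety_ss = 444.5
error_ss, error_df = 6704.0, 16
f_unadjusted = variety_ss / (error_ss / error_df)

# Error reduced by extracting the linear regression on age.
reduced_ss, reduced_df = 1275.6, 14
reduced_variance = reduced_ss / reduced_df

# Standard error of the difference between means of 8 and 10 crops.
n_guinea, n_elephant = 8, 10
se_difference = math.sqrt(reduced_variance / n_guinea
                          + reduced_variance / n_elephant)

mean_guinea, mean_elephant = 45.0, 35.0
t_value = (mean_guinea - mean_elephant) / se_difference

print(f"F on unadjusted error = {f_unadjusted:.2f}")
print(f"reduced error variance = {reduced_variance:.2f}")
print(f"S.E. of difference = {se_difference:.3f}, t = {t_value:.3f}")
```

The error variance falls from about 419 to about 91, which is what converts a nonsignificant variance ratio into a t beyond the 5 per cent point.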
The analysis of covariance is a term used to define the statistical
technique by means of which the complete analysis of the
simultaneous variation occurring in two or more correlated
variables is effected. It is a more exact, if rather more intricate,
method of discounting the influence on research results of changes
occurring in some measurable but uncontrollable external factor.
In the analysis of covariance, the regression component is
extracted, not only from the error variance, but also from all
the other component factors of the total analysis of variance.
Furthermore, the regression equation is used to provide a revised
estimate of the treatment means, adjusted so as to compensate
for the variability of the independent factor. It is proposed
to use the same data (Examples 24, 25) to exemplify a simple
analysis of covariance. The first step is to make out a table,
of the type given below, showing the sums of squares and the
sums of products for the total and for each of the components
in the analysis of variance of the dependent factor x and of the
independent factor y. Some of the required sums of squares and
sums of products have not been previously worked out, but it is
presumed that the student can by now carry out any of these
routine calculations for himself from the original data of Tables 48
and 50. For example, for the age factor y,

    S.S. variety = \Sigma [ (variety totals)^2 / (no. of variates in each total) ]
                   - C.F., i.e., (grand total)^2 / (total no. of variates)
                 = 36^2/8 + 55^2/10 - 91^2/18
                 = 4.444

Similarly, the total S.P. xy = -797.44.

In this way, Table 52 showing the full analysis has been
compiled.

Table 52. Analysis of Variance and Covariance for Fodder Grass
Experiment Recorded in Examples 24 and 25

                                              Degrees   Deviations from   Degrees
                                              of        regression,       of        Reduced
Factor      S.S. x     S.P. xy     S.S. y     freedom   \Sigma(x - X)^2   freedom   variance
Total       7,148.5    -797.44     128.94       17         2,216.6          16
Variety       444.5     -44.44       4.444       1             0             0
Error       6,704.0    -753.0      124.5        16         2,149.7          15       143.3
Residual \Sigma(x - X)^2                                       66.9           1        66.9

The last three columns in Table 52 require some further
explanation. \Sigma(x - X)^2 is calculated from the identity

    \Sigma(x - X)^2 = S.S. x - (S.P. xy)^2 / S.S. y

and is evaluated separately for each line, i.e., for each factor in the
analysis. It represents the balance of the sum of squares of x for
each factor after the regression component for that particular factor
has been deducted, the regression component being
(S.P. xy)^2 / S.S. y. The regression component in each case accounts
for 1 degree of freedom, and the number of degrees of freedom of each
\Sigma(x - X)^2 is therefore one less than that for the corresponding
sum of products. It will be noticed that, for the variety factor, the
deviations from regression and the number of degrees of freedom
are both zero. With only two treatments, this will always be the
case, as obviously with two values, the line of best fit is the
straight line connecting them, and the deviations from this linear
regression will therefore be nil. When there are more than two
treatments, the \Sigma(x - X)^2 for the treatment or variety component
will generally show a numerical value, as it represents the
deviations of the treatment means from the line of regression
fitted to them.

Before carrying out an analysis of covariance with a view to
improving the estimate of error, it is advisable to make sure that
the regression coefficient of x on y is significant. If the regression
is nonsignificant, there is not likely to be much advantage in
proceeding further with the covariance calculations. The best
value of the coefficient of regression comes from the error line of
the table, and in carrying out an analysis of covariance, this line
should be calculated first and the significance of b_{xy} tested. In
this example, the appropriate b_{xy} = -753.0/124.5 = -6.05. Its
significance may be determined in the usual way by calculating the
standard error, but it is probably simpler here to use the F test
to compare the variances of the regression and the deviations
from the regression. The variance attributable to this linear
regression is 753^2/124.5 = 4,554.3; or alternatively 6,704 - 2,149.7.
F is therefore 4,554.3/143.3 = 31.78. The reading from the Table
of F for n_1 = 1 and n_2 = 15 and P = 0.01 is only 8.68, proving
that b_{xy} is definitely significant.

It will be noticed that, in the analysis of covariance (Table 52),
there is a residual \Sigma(x - X)^2 to which the single remaining
degree of freedom in the penultimate column can be validly
allocated. This residual variance actually is a measure of the
difference between the regression coefficients of the variety and
error components. It is this residual variance which has to be
compared with the corresponding variance for error by the F
or z tests in order to determine whether there is any significant
difference between the treatment means of the dependent variable
x after they have been adjusted or corrected for age inequalities
by means of the regression coefficient b_{xy}. In this example, the
comparable reduced variances are


    Error                                             143.3
    Residual, i.e., difference between regressions     66.9

For these data, n_1 = 15 and n_2 = 1, corresponding to a reading
from the Table of F of approximately 245. The difference between
the variances is therefore quite insignificant. It must therefore
be assumed that, when the mean yields of the two fodder grasses
are equalized for the age factor, there is no significant difference
between them.

This conclusion is apparently contrary to that obtained
when the actual mean yields were tested by the error variance
with the linear regression component deducted. However,
both conclusions are logical, and in order to demonstrate why this
is so, it is necessary to calculate the values for the variety means
corrected for age. If y_t represents the mean age recorded for
any given treatment, from the regression relationship of yield
on age, the expected corresponding average deviation from
M_x, the general mean of the dependent variable x, will be
b_{xy}(y_t - M_y). The values of b_{xy}(y_t - M_y) represent the amounts
by which the respective treatment means have to be corrected
in order to put them on an equal age basis as determined by
regression. The corrected mean yield will be given by

    x_t - b_{xy}(y_t - M_y)

where x_t = any treatment mean of the dependent factor,
b_{xy} = the coefficient of regression from the error line of the
          analysis of covariance,
y_t = the mean value of the independent factor corresponding
          to x_t,
M_y = the general mean of the independent factor.

The adjusted mean yields for the fodder grass experiment
will be:

    Guinea grass   = 45 - (-6.05)(4.5 - 5.05)
                   = 45 - 3.33 = 41.67 kg.
    Elephant grass = 35 - (-6.05)(5.5 - 5.05)
                   = 35 + 2.72 = 37.72 kg.

When the mean variety yields are corrected for the age factor,
the difference between them is reduced from the original 10
kilograms to 3.95 kilograms. It is this difference which the F
test comparing the residual with the reduced variance in the
covariance table has shown to be nonsignificant.
The complete analysis of covariance proves that the apparent
superiority of the mean yield of the guinea grass over that of the
elephant grass can be largely attributed to the difference in
the mean age of the crops recorded. When the mean yields are
adjusted to an equal age basis by means of the regression of yield
on age, the guinea grass shows no significant increase in yield
over the elephant grass. The analysis of covariance has provided
an accurate interpretation of data which otherwise might have
been responsible for rather erroneous conclusions. This is a
very simple example of the covariance technique, but the application
of the same principles to more complex data will be
found elaborated in Chap. IX in connection with uniformity
trials in field experimentation.
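The whole covariance calculation for this two-variety experiment can be retraced in a short script. The sketch below is an illustration only: the raw sums (20,988 + 14,166 for the squares of the yields, grand total 710, age totals 36 and 55, sums of products -797.44 and -753.0) are those quoted in the text, the names are invented, and because the coefficient is carried unrounded the adjusted means differ from the hand-worked 41.67 and 37.72 in the second decimal place.

```python
def deviations_from_regression(ss_x: float, sp_xy: float, ss_y: float) -> float:
    """Balance of S.S. x after deducting the regression component."""
    return ss_x - sp_xy ** 2 / ss_y


# Total line: S.S. x from the raw sums, S.P. xy and S.S. y as quoted.
total_ss_x = 20988 + 14166 - 710 ** 2 / 18
total_sp_xy, total_ss_y = -797.44, 128.94

# Error (within-variety) line of the covariance table.
error_ss_x, error_sp_xy, error_ss_y = 6704.0, -753.0, 124.5

dev_total = deviations_from_regression(total_ss_x, total_sp_xy, total_ss_y)  # 16 d.f.
dev_error = deviations_from_regression(error_ss_x, error_sp_xy, error_ss_y)  # 15 d.f.
residual = dev_total - dev_error                                             # 1 d.f.

error_variance = dev_error / 15
# The larger variance is placed over the smaller, as in the text.
f_ratio = error_variance / residual

# Adjust the variety means to a common age with the error-line coefficient.
b_error = error_sp_xy / error_ss_y
general_mean_age = 91 / 18


def adjusted_mean(mean_yield: float, mean_age: float) -> float:
    """Corrected mean: x_t - b_xy (y_t - M_y)."""
    return mean_yield - b_error * (mean_age - general_mean_age)


guinea = adjusted_mean(45.0, 4.5)
elephant = adjusted_mean(35.0, 5.5)

print(f"error variance = {error_variance:.1f}, residual = {residual:.1f}")
print(f"F = {f_ratio:.2f}")
print(f"adjusted means: guinea {guinea:.2f}, elephant {elephant:.2f}")
print(f"adjusted difference = {guinea - elephant:.2f} kg.")
```

The guinea grass was recorded over younger crops on average (4.5 against 5.5 months), so its mean is adjusted downward and that of the elephant grass upward.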


TEST FOR LINEARITY OF REGRESSION LINE 


In all the preceding examples of the application of the regres- 
sion principle to experimental data, it has been assumed that 
the regression is linear. While this is undoubtedly the form 
most widely applicable in agricultural research, it is by no inean>s 
the only form that the regression can take, as the line of best 
fit to the plotted points on the regression graph may be in the 
nature of some definite curve rather than a straight line. For 
correct statistical evaluation, it may therefore be important to 
be able to recognize those occasions in which the linear regression 
function will not provide an accurate interpretation of the 
recorded data, and this can be determined by carrying out a 
relatively simple test of the straightness of the regression line.

In most problems involving the regression function, there will 
be several values of the dependent factor recorded against each
value of the independent factor. The variates of the dependent
factor can therefore be grouped in arrays in accordance with the
class of the independent factor with which they are associated. 
The following data have been extracted from Table 47, which is a 
correlation table for the yield of oats x recorded against the 
number of culms y. The number of variates in each array is 
the number of individuals or frequency in each row of the correlation
table, i.e., in each culm class.

The total sum of squares of the yield data, S.S. x, is the sum
of the sums of squares between arrays and within arrays,




Table 53. Yield of Grain and Number of Culms for 100 Oat Plants

Data from Table 47

Culm no. or     No. of variates in     Total yield     Mean yield for
array mean      array (frequency)      of array        each array, T_a/f
(m_y)           (f)                    (T_a)           (m_x)

   2                13                     31              2.39
   3                33                    109              3.31
   4                42                    190              4.53
   5                 8                     43              5.37
   6                 3                     17              5.67
   7                 1                      8              8.00

 Total             100                    398

Direct calculation of deviations from regression

Culm no.     Predicted mean yield     m_x - X_a     (m_x - X_a)^2     f(m_x - X_a)^2
(m_y)        for each array (X_a)

   2               2.38                  0.01           0.0001            0.0013
   3               3.39                 -0.08           0.0064            0.2112
   4               4.40                  0.13           0.0169            0.7098
   5               5.41                 -0.04           0.0016            0.0128
   6               6.42                 -0.75           0.5625            1.6875
   7               7.43                  0.57           0.3249            0.3249

S.S. deviations from regression = 2.9475

S.S. x = 171.96          Mean yield (M_x) = 3.98
S.S. y = 96.36           Mean culm no. (M_y) = 3.58
S.P. xy = 97.16          b_xy = S.P. xy / S.S. y = 1.01

calculated from the above data, allowance being made for the
different number of variates in the separate arrays.


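\[
\text{S.S. between arrays} = S\left(\frac{T_a^2}{f}\right) - \frac{(\text{grand total})^2}{100}
= \frac{31^2}{13} + \frac{109^2}{33} + \frac{190^2}{42} + \frac{43^2}{8} + \frac{17^2}{3} + \frac{8^2}{1} - \frac{398^2}{100}
\]

= 100.92, having 5 degrees of freedom

S.S. within arrays = 171.96 - 100.92

i.e., error = 71.04, with 99 - 5 = 94 degrees of freedom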

The sum of squares between arrays is itself a complex component
in that it represents the aggregate effect of the regression of
yield on the number of culms and the deviations from this
regression.

\[
\text{S.S. linear regression} = \frac{(\text{S.P. } xy)^2}{\text{S.S. } y} = \frac{97.16^2}{96.36} = 97.97, \text{ with 1 degree of freedom}
\]

Deviations from regression = 100.92 - 97.97

= 2.95, with 4 degrees of freedom

The within-array variance provides a fair estimate of the
uncontrollable sampling errors of the yield data with the influence
of the independent factor eliminated. If the regression is truly 
linear, the variance of the deviations from the regression should 
not differ significantly from that of error. This may be tested 
in the usual way, by calculating F or z. The full analysis is 
appended. 

Table 54. Analysis of Variance of Yield Data x

                                   S.S.     Degrees of     Variance
                                            freedom

Total                             171.96        99
Between arrays:
  Linear regression                97.97         1           97.97
  Deviations from regression        2.95         4            0.74
Within arrays or error             71.04        94            0.76


The variance of the deviations from regression is obviously 
not significantly different from that of error, proving that it may 
safely be assumed that the regression of yield on number of culms 
is linear in form. 
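
The arithmetic of this test can be retraced with a short Python sketch. It uses the array frequencies and totals of Table 53 and the sums of squares and products quoted there; the variable names are assumptions made for the sketch. Because it works with unrounded intermediate values, its results agree with the figures in the text only to within small rounding differences.

```python
# Array data from Table 53: culm number -> (frequency f, total yield of array T_a)
arrays = {2: (13, 31), 3: (33, 109), 4: (42, 190), 5: (8, 43), 6: (3, 17), 7: (1, 8)}
SS_X, SS_Y, SP_XY = 171.96, 96.36, 97.16   # quoted in Table 53

n = sum(f for f, _ in arrays.values())             # number of plants
grand_total = sum(t for _, t in arrays.values())   # total yield of all plants

# Between-array sum of squares, weighting each array by its frequency.
ss_between = sum(t * t / f for f, t in arrays.values()) - grand_total ** 2 / n
ss_within = SS_X - ss_between                      # error
ss_regression = SP_XY ** 2 / SS_Y                  # linear regression, 1 d.f.
ss_deviations = ss_between - ss_regression         # deviations from regression

df_between = len(arrays) - 1
df_deviations = df_between - 1
df_within = (n - 1) - df_between

# Ratio of the deviations variance to the error variance.
f_ratio = (ss_deviations / df_deviations) / (ss_within / df_within)

print(f"between arrays  {ss_between:8.2f}  d.f. {df_between}")
print(f"regression      {ss_regression:8.2f}  d.f. 1")
print(f"deviations      {ss_deviations:8.2f}  d.f. {df_deviations}")
print(f"within arrays   {ss_within:8.2f}  d.f. {df_within}")
print(f"F (deviations / error) = {f_ratio:.2f}")
```

A ratio below unity, as here, cannot be significant, which is the sense in which the deviations variance is "obviously" no greater than error.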

The similarity between the covariance technique and this 
test for the straightness of the regression line is fairly obvious, 
especially if the arrays are regarded as the equivalent of the treat- 
ment or variety grouping of the covariance table. It is possible 
to calculate the sum of squares of the deviations from regression 
directly, and this has been done in the second half of Table 53, 
as it may help the student to a clearer understanding of the exact
significance of this component of the analysis. The deviations 
component represents the sum of squares of the deviations of 
the means of arrays from the regression line, due weight being 
given to the number of variates from which each mean was 
evaluated. Any one deviation from regression is the difference 
between the recorded array mean and the expected value X, as
determined by regression. From the general regression equation,

\[ X = M_x + b_{xy}(y - M_y) \]

Therefore, substituting the appropriate symbols from Table 53,

\[ X_a = M_x + b_{xy}(m_y - M_y) \]

For the first line of the table,

\[ X_a = 3.98 + 1.01(2 - 3.58) = 2.38 \]

In the same way, the values of X_a quoted for the other lines were
determined. The deviations from the recorded array means are
next calculated and squared. As each represents the deviation of
a mean of f variates, the squares have to be multiplied by the
appropriate value of f for the array, giving the figures recorded
in the final column of Table 53. This column totals to 2.95
approximately, the value already obtained by the indirect method
of computing the deviations from regression. The direct method
not only is a useful check on the arithmetic, but may also demonstrate
which arrays or culm classes are chiefly responsible for
nonlinearity in problems in which the regression is a curve.
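
The direct method lends itself to a short Python sketch, given here under the assumption that the array means are carried unrounded (the text rounds them to two decimals, so the total differs slightly in the second decimal place). The names used are illustrative only.

```python
M_X, M_Y, B_XY = 3.98, 3.58, 1.01   # general means and regression coefficient, Table 53
# culm number (array mean of y) -> (frequency f, total yield of array T_a)
arrays = {2: (13, 31), 3: (33, 109), 4: (42, 190), 5: (8, 43), 6: (3, 17), 7: (1, 8)}

rows = []
for culms, (f, total) in arrays.items():
    m_x = total / f                          # recorded array mean
    x_a = M_X + B_XY * (culms - M_Y)         # mean predicted by the regression
    deviation = m_x - x_a
    rows.append((culms, f, m_x, x_a, deviation, f * deviation ** 2))

ss_deviations_direct = sum(row[5] for row in rows)
largest = max(rows, key=lambda row: row[5])  # array contributing most to the total

for culms, f, m_x, x_a, deviation, weighted in rows:
    print(f"{culms}  f={f:3d}  m_x={m_x:5.2f}  X_a={x_a:5.2f}  "
          f"dev={deviation:6.2f}  f*dev^2={weighted:7.4f}")
print(f"S.S. deviations from regression = {ss_deviations_direct:.2f}")
print(f"largest contribution comes from the {largest[0]}-culm array")
```

Listing the weighted squares array by array is what allows the classes responsible for any departure from linearity to be picked out.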

Regression equations which accurately define curved regression
lines can be derived, but these are of little practical importance
in agricultural research. Even the partial regression equations
which express the regression of the dependent factor on several 
correlated factors concurrently are also of only limited applica- 
tion. These aspects of applied statistics are beyond the scope 
of an elementary book on the subject. 

It is hoped that the examples worked out in this chapter are 
adequate to ensure an understanding, not only of the affinity 
between the correlation and regression concepts, but also of their 
essential differences. Both are concerned with the simultaneous
variation occurring in two or more variables. The former 
measures the intensity of association between the correlated 
variables as a whole, and the correlation coefficient is a pure 
number independent of the magnitude of the units in which the 
variates are measured. Regression values, on the other hand, 
reflect the units of measurement and deal essentially with the 
relationship between mean values predicting the number of units 
that the dependent variable may be expected to change on the 
average for a change of any specified number of units of the inde- 
pendent factor. Correlation may therefore be said to measure 
the relative effect and regression the absolute effect of one
variable on another.


CHAPTER VII 
FIELD EXPERIMENTS 
INTRODUCTION 

Recent advances in statistical method have practically 
revolutionized the technique of field experimentation, so that 
today this branch of research has become an exact science in 
comparison with its previous rather empirical status. Improve- 
ment has taken place simultaneously (a) in the field technique 
and (b) in the final evaluation of the data in the laboratory. 
While this book is more directly concerned with the latter 
aspect, the two are interdependent to such an extent that
some discussion of the former is essential to a reasonable under- 
standing of the latter. It has therefore been thought advisable 
to include the following general resume of the principles and prac- 
tice of field experiments before proceeding to an elaboration 
of the statistical treatment of the data.

It has long been recognized that there is no easy road to
success in agricultural experimentation. Every problem tackled
is, of necessity, a complex one on account of the following:

a. The number of interacting factors that have to be
considered in summing up results.

b. Soil heterogeneity.
c. Seasonal heterogeneity.

Field trials are not generally of the nature of fundamental 
research but are normally planned essentially from the utility 
viewpoint as a means of improving cultural practice in one 
direction or another. The ultimate measure of improvement
is the resultant net prosper ... as yield, quality, hardi ...
manurial requirements, market demand, etc., all of which ...
due consideration in evaluating results. With ...
of a few of the more primitive colonies, the ...
cultivation practiced today is relatively high, ...
any advance on the existing methods is not likely to be of the
perfectly obvious 100 per cent class but rather of the form
of a 5 to 10 per cent increment. Thus, even if one particular
treatment is better than another, the treatment difference
is relatively slight, and in order to obtain satisfactory proof of
this, trials have to be carefully planned and accurately
executed. It is essential to give the various treatments tested
as nearly as possible similar conditions. As we shall see later,
in field experiments there are certain influential factors over 
which man has little or no control, and this makes it even more
necessary to ensure equality in those others over which full
command can be exercised. Assuming that the original design of
the experiment is technically sound, then accuracy of execution
as regards such practical details as plot size, plant population,
cultivation, harvesting, units of measurement, developmental 
studies, etc., is the sine qua non of successful experimentation. 
Such accuracy can be guaranteed only where skilled supervision 
and labor are available and, in consequence, the field experiment 
is a relatively expensive form of research. Moreover, the results 
from any one experiment are of limited application and hold good 
only for the particular soil and the particular season in which
it was located. To establish the truth of any general law, a
number of separate experiments in various soil types and over
several seasons would have to be carried out and the aggregate 
data used to prove any particular hypothesis. Field experiments
should therefore be the final stage in the solution of any given
agricultural problem and should be resorted to only after the
simpler and less expensive methods of eliminating any obviously
unsatisfactory practices have been utilized. For example, in
plant breeding, a variety trial should only be used as a test of a
limited number of the best strains surviving after so many years'
rigid selection from probably several hundred original types.
Similarly, exacting field trials as a step toward improvement in a
primitive agricultural community, where the general principles
of good cultivation are continually flouted, would definitely
be out of place. Under such conditions, the obvious line of
advance is the establishment of a higher standard of field practice
and a more intelligent appreciation of the elementary laws of
crop growth. The value of any experiment must be gauged
from the possible increase in national crop output, and in each
problem, the simpler methods of achieving any given objective
should be explored before expensive field trials are inaugurated.

The extremely variable nature of both soil and season forms 
an unavoidable obstacle to the easy solution of problems by field 
experimentation. Soils vary in fertility not only from acre to
acre but even from foot to foot in any one field or plot. This
makes it impossible to produce identical soil conditions for the 
various treatments, and numerous uniformity trials effectively 
illustrate the magnitude and ubiquity of the variation in yield 
arising solely from soil fertility differences. An apparent 
increment in favor of any one treatment may be entirely due to 
the fact that the plots of that particular treatment happen to 
be located on relatively fertile soil pockets, and the increment 
may have little or no relationship with the true potential yield 
values of the treatment. Similarly, with season, the prevailing
climatic conditions in any one year may unduly favor one or two 
particular treatments at the expense of the remainder, when, 
actually, some slight change in climate might be sufficient to 
cause a radical change in the relative merits of the treatments 
tested. From experiments with 14 varieties of wheat over a
period of 9 years, Engledow and Yule have demonstrated that,
as a direct consequence of seasonal variation alone, varietal
yields may fluctuate over a range of ... of their mean value.
Therefore, where the change in weather is at all marked, it is by
no means impossible for the conclusions of one season’s work to be
practically reversed in the next. This brief discussion of the
various problems with which the field experimentalist has to
contend should be sufficient to prove that, as Engledow aptly
quotes, “Alice soon came to the conclusion that it was a very
difficult game indeed.”

It is now necessary to examine the precautions that may 
be taken in order to offset, to some extent, the effects of the various
environmental agencies at work and to arrive at an accurate 
appreciation of the relative merits of the various characters or 
treatments that it is desired to compare.

... cultural technique, and accurate grading and evaluation of the
produce. The chances of success are very much greater when
a specialized experience of the crop has already been acquired.
Where this experience is entirely lacking, e.g., when a crop is
first introduced into a new locality, large-scale experiments 
should not be attempted, as the field practice may be so artificial 
as to provide no fair test of the various factors under observation.
The yield data represent only a part of the value of any experiment
and should be augmented by regular field notes, recording
the general progress of the crop and the more obvious differences
between treatments at any particular stage of growth. Such 
notes can only be of critical value when the observer has the 
necessary basic experience of the crop. 

Experimental Site.— The area of land selected for the experi- 
ment should conform to the general soil and environmental conditions
under which it is intended to grow the crop commercially.
To quote an extreme example, field experiments with sugar cane
in England would be obviously of no economic value. Crop
trials in unsuitable soils or in an abnormal environment may 
give results of a certain academic interest but are liable to be 
wrongly interpreted and misapplied by the practical farmer, 
especially if they are of a spectacular nature. 

It is of the utmost importance to select the most uniform
piece of ground available in order to minimize the effects of soil
fertility differences between plots. A fair estimate of uniformity
may be obtained from inspection of the previous crop,
especially if such observations are supported by the experience 
of the resident farmer. Soil pits dug at frequent intervals are 
also helpful in determining whether the soil and subsoil are 
reasonably homogeneous or not. The importance of this ques- 
tion of initial soil uniformity cannot be too strongly emphasized. 
The theory that the improved technique used in field experiments 
today has made it immaterial whether the land is uniform or not 
is entirely false. Although with modern plot arrangements, the 
effects of soil heterogeneity can certainly be reduced in the 
analysis of the data, they cannot by any means be eradicated, 
and the more uniform the experimental site, the greater are the
chances of obtaining a true evaluation of results. 

Specification of Problem. — The nature of the problem should 
be exactly specified before any experimental plan is drawn up.
The choice of the particular treatments to be tested is especially
important, as a relatively slight difference in the range covered,
in the quantities of fertilizer applied, may make all the difference
between conclusive and inconclusive results. This presupposes
an estimate of the magnitude of the difference between
treatments that is likely to be obtained. The influence on the
results of uncontrollable environmental factors makes the proof
of small differences of the 2 to 3 per cent class of little practical
significance. In a new line of research, it is advisable to select
a range of treatments that theoretically will be bound to show
relatively large treatment differences and then gradually to
modify this range in succeeding experiments so as to bracket
the optimum treatment. The advantages of complex experi- 
ments, where several different series of characters are included 
in a single large trial, have already been discussed (page 66).
Complex experiments require more expert supervision, careful
recording, and a valid statistical interpretation of the data. The
amount of complexity advisable will depend entirely on the
experience of the staff in charge and on the facilities available
in the way of labor, funds, and technical equipment. A simple
experiment efficiently consummated is much to be preferred to a
complex one in which the results are of doubtful accuracy because
of possible errors of execution or interpretation.

The need for a high standard of accuracy in the field practice
so as to give each plot as nearly as possible identical environmental
conditions in the way of plot size, cultural attention,
plant population, grading of produce, measurement of yields,
etc., has already been mentioned. All these details should be
specified when the experimental plan is first drawn up.

For any particular treatment, a single large plot, even
if it is several acres in extent, cannot give yield data of
any value for comparative purposes. The experimental area
should be divided into a number of similar plots, and so many
plots allocated at random to each treatment. The greater the
number of replications of any one series, the greater are the
chances of obtaining an accurate result. There should be sufficient
replications to ensure a fair measure of the mean and the
standard deviation. It is not generally wise to reduce the number
of replications of any one series below 4, and 6 to ...
the statistical interpretation of the
results, the analysis of variance technique will usually be adopted,
so that the standard deviation, on which the statistical comparison
of mean differences is based, will be the square root of the
error variance. It is advisable to plan the experiment so as to
yield an error variance based on not less than 10, and preferably
on 20 or more, degrees of freedom.
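
As a planning aid, the degrees of freedom available for error can be worked out before the experiment is laid down. The sketch below assumes the two layouts treated later in this chapter, the purely randomized layout and the randomized block layout; the function names are hypothetical.

```python
def error_df_randomized(treatments, replications):
    """Error d.f. when plots are allotted to treatments purely at random."""
    return treatments * (replications - 1)


def error_df_randomized_block(treatments, blocks):
    """Error d.f. when each block holds one plot of every treatment."""
    return (treatments - 1) * (blocks - 1)


# Three varieties with six replications, as in the wheat trial of Example 26.
print(error_df_randomized(3, 6))         # purely randomized layout
print(error_df_randomized_block(3, 6))   # the same plots treated as six blocks

# Smallest number of blocks giving at least 10 error d.f. for four treatments.
blocks = 2
while error_df_randomized_block(4, blocks) < 10:
    blocks += 1
print(blocks)
```

The comparison shows the price of the block layout: some degrees of freedom are given up to the block component, so a few more replications may be needed to keep the error estimate above the recommended minimum.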

Type of Plot.— No particular size or shape of plot can be 
described as best in all circumstances. For a given number 
of replications, the larger the plot up to ... acre, the more
accurate are the yield data likely to be. In plots above this
size, the increase in soil heterogeneity within the plot will
generally more than offset any advantage derived from increasing
the plot area. Where the land or the other facilities are limited,
a large number of small plots is generally to be preferred to a few
large ones. A good average size for general utility purposes
is 1/40 acre.

The shape of the plot may be made anything from square or 
rectangular to a long narrow strip. The dimensions should be
chosen so as to give a correct field layout and at the same time
utilize most effectively the experimental site chosen. Where
border effects are likely to occur, i.e., where the crop in one treatment
is likely to interfere with the proper growth of the crop
at the edge of the adjacent plot of a second treatment, a nonexperimental
border of sufficient width must be left round each
plot to ensure that these interference effects will not be reproduced
in the yield data. This border is cultivated in exactly the same
way as the plot to which it belongs, but the crop it carries is cut
out before the experiment is harvested, leaving, for measurement,
an effective plot unit equivalent to the area within the border.
In this connection, it should be noted that the square plot is the
one having the smallest perimeter.

Repetition. The conclusions from any single experiment are
only valid for the particular season and the particular soil in
which the experiment was located. This makes it necessary to
repeat the experiment over several years and in various soil types,
so as to ascertain the exact range of environmental conditions
for which the results can be stated to hold good. The standard
test of significance is based on chances of 1 in 20, so that, theoretically,
over a large number of experiments there is a distinct
possibility that a few of the conclusions will be inaccurate. This
is an added reason why repeat experiments are necessary to
supply adequate proof of the accuracy of any result of practical 
significance. The scientist cannot afford to make serious errors 
in his recommendations. Such mistakes in the past have some- 
times led to a loss of confidence in his work by the very com- 
munity which his researches are intended to benefit. 
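
The force of the 1 in 20 standard over a series of trials can be illustrated numerically. The sketch assumes independent experiments in each of which there is no real treatment difference, so that every "significant" result is a false one; the function names are illustrative.

```python
P_SIGNIFICANT_BY_CHANCE = 1 / 20   # the standard test of significance


def expected_false_results(n_experiments, p=P_SIGNIFICANT_BY_CHANCE):
    """Average number of chance 'significant' results in n experiments."""
    return n_experiments * p


def chance_of_at_least_one(n_experiments, p=P_SIGNIFICANT_BY_CHANCE):
    """Probability that at least one of n experiments misleads."""
    return 1 - (1 - p) ** n_experiments


for n in (1, 10, 20, 100):
    print(f"{n:3d} experiments: expected false results "
          f"{expected_false_results(n):.2f}, "
          f"chance of at least one {chance_of_at_least_one(n):.2f}")
```

Under these assumptions a single misleading result becomes more likely than not once about fourteen such trials have been run, which is why a finding of practical importance should rest on repeated experiments.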

Arrangement of Plots. The plots must be arranged in the field
in a manner that will render possible a valid statistical interpretation
of the yield data. Statistical treatment is essential, if
real differences between treatments are to be separated from
purely fortuitous ones resulting from soil heterogeneity or other
uncontrollable external agency. Statistical significance is based
on the assumption that the estimates of the means and standard
deviations obtained from the data approximate to the true values
that would have been obtained from an infinitely large number
of plot replicates, i.e., from the whole population. This makes
it essential that the location of the plots of any one treatment
should be a random one. On the other hand, it is known that
there tends to be a close correlation between the soil fertility
of adjacent plots, and in consequence, there is a much greater
chance of demonstrating a real difference between two treatments 
A and B if they are located on contiguous plots than if they are 
widely separated. There are various standard layouts which 
satisfy both these requirements. 

Fisher's diagram in the Statistical Laboratory at Rothamsted
effectively summarizes the principles from which modern experimental
methods have been evolved.

[Diagram: Replication (top); Random distribution and Local control
(middle); Validity of estimate of error and Diminution of error (bottom).]

Fisher's diagram in Statistical Laboratory, Rothamsted.

It is proposed now to describe a few of the standard designs
used in field experiments and give examples to demonstrate
an appropriate statistical analysis of the data in each case. It
will be assumed that the student is familiar with the basic
formulas and arithmetical procedure as described in Chaps. I and
II in connection with the analysis of variance. In field experiments,
the statistical principles remain unaltered, and the yield
data are evaluated by the particular form of the analysis of
variance appropriate to the experimental design.

Example 26. Varietal Test of Wheat — In this experiment 
three varieties of wheat were sown on plots of 1/40 acre each.
Six replications of each variety were used, making 18 plots in
all. The location of the six plots of any one variety over the
site of the experiment is pro tempore assumed to be a random
one.

Table 55. Yield of Grain in Kilograms per 1/40 Acre Plot

Serial no.        Variety A     Variety B     Variety C     Total across

Variety total         69            57            90        Grand total 216



The statistical analysis is similar to that adopted in Example
7, the variable-squared method of calculation being used.

\[ \text{Total S.S.} = 8^2 + 14^2 + 12^2 + \cdots + 12^2 + 13^2 + 13^2 - \frac{216^2}{18} = 184 \]

\[ \text{Between-variety S.S.} = \frac{69^2 + 57^2 + 90^2}{6} - \frac{216^2}{18} = 93 \]

\[ \text{Within-variety S.S., i.e., error} = 184 - 93 = 91 \]

This last component of the analysis may be calculated independently
as a check on the arithmetic. It will be ...

Table 56. The Analysis of Variance

                                  S.S.     Degrees of     Variance       F
                                           freedom

Total                             184         17
Between varieties                  93          2           46.50        7.66
Within varieties, i.e., error      91         15            6.07


The reading of F from the table for n1 = 2, n2 = 15, and
P = 0.01 is 6.36, proving that, in this experiment, the variety
variance is significantly greater than that for error. A difference
between variety totals greater than

\[ \sqrt{6.07 \times 6 \times 2} \times 2.131 = 18.19 \]

is significant. The variety totals are

A  69        B  57        C  90

so that variety C has yielded more than either A or B.
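
The analysis of Example 26 can be retraced from its totals alone. The sketch below takes the total sum of squares (184), the grand total (216), and the tabular value of t (2.131) from the text, and uses variety totals of 69, 57, and 90, which are the totals implied by the grand total and by the bushels-per-acre figures quoted later in the chapter. The variable names are assumptions.

```python
import math

variety_totals = {"A": 69, "B": 57, "C": 90}   # kilograms, six plots each
PLOTS_PER_VARIETY = 6
TOTAL_SS = 184            # total sum of squares quoted in the text
T_5_PER_CENT = 2.131      # tabular t for 15 degrees of freedom, P = 0.05

n_plots = PLOTS_PER_VARIETY * len(variety_totals)
grand_total = sum(variety_totals.values())
correction = grand_total ** 2 / n_plots

variety_ss = sum(t * t for t in variety_totals.values()) / PLOTS_PER_VARIETY - correction
error_ss = TOTAL_SS - variety_ss
variety_df = len(variety_totals) - 1
error_df = n_plots - len(variety_totals)

error_variance = error_ss / error_df
f_ratio = (variety_ss / variety_df) / error_variance

# Least difference between two variety totals that can be called significant.
significant_difference = math.sqrt(error_variance * PLOTS_PER_VARIETY * 2) * T_5_PER_CENT

print(f"variety S.S. = {variety_ss:.0f}, error S.S. = {error_ss:.0f}")
print(f"F = {f_ratio:.2f}, significant difference = {significant_difference:.2f}")
for first, second in (("C", "A"), ("C", "B"), ("A", "B")):
    difference = variety_totals[first] - variety_totals[second]
    verdict = "significant" if difference > significant_difference else "not significant"
    print(f"{first} - {second} = {difference}: {verdict}")
```

The text rounds the error variance to 6.07 before taking the square root, so the significant difference printed here may differ from 18.19 in the second decimal place.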

RANDOMIZED BLOCK LAYOUT

There are various alternatives to the purely randomized
plot layout described above. These alternatives generally
lead to a valid reduction in the estimate of the error variance, as
they are planned so as to make use of the fact that contiguous
plots tend to be positively correlated in soil fertility. The randomized
block layout is the simplest of these controlled arrangements.
The land selected for the experiment is divided up into
a number of sections or blocks of similar dimensions. The
number of blocks is equal to the number of replications, so that
if there are five plots of each treatment, there will be five blocks.
Rectangular or square blocks are probably the best in order to make ...
This will reduce ... more or less ...
the primary consideration. Each block should be divided
into the same number of plots of a given size and shape. The
number of plots in the block must be equal to the number of
different treatments to be compared, so that if four varieties
are under test, there will be four plots in each block. A single
plot of each treatment must be included in each block. Thus
every series is represented once in every block and, to this extent,
the arrangement of the plots is a controlled one. The allocation
of the treatments to the particular plots within a block should
be a purely random one, determined by drawing lots or by other
chance method. This randomization of treatments within each
block is absolutely essential if the ordinary statistical tests of
significance are to be validly applied. The total number of plots
in the experiment is the number of treatments multiplied by
the number of replications, i.e., the number of plots in a block
multiplied by the number of blocks.
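
Drawing lots for each block can be imitated with a pseudo-random shuffle. The following is a minimal sketch, assuming the Python standard library's `random` module is an acceptable substitute for a physical chance method; the function name and the optional `seed` argument (used only so that a layout can be reproduced) are assumptions of the sketch.

```python
import random


def randomized_blocks(treatments, n_blocks, seed=None):
    """Return one independently shuffled order of the treatments per block."""
    rng = random.Random(seed)
    layout = []
    for _ in range(n_blocks):
        block = list(treatments)   # one plot of every treatment in each block
        rng.shuffle(block)         # fresh randomization within the block
        layout.append(block)
    return layout


layout = randomized_blocks(["A", "B", "C", "D"], n_blocks=5, seed=1)
for number, block in enumerate(layout, start=1):
    print(f"Block {number}: {' '.join(block)}")
```

Each block is shuffled separately, so the control lies in every treatment appearing once per block, while the position of a treatment inside any block is left entirely to chance.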

Example 27.— It is possible to use the data from the wheat 
variety trial (Table 55) to exemplify the statistical technique
applicable to a randomized block design. Actually, in this
experiment, the arrangement of the 18 plots was not completely
randomized, but the plots were grouped together in groups of
three in juxtaposition, giving six similar blocks. Each treatment
appeared once in each block, the arrangement of the three treatments
within any one block being a random one. The serial
numbers in the first column of Table 55 represent the six blocks,
and the yields of the three plots in each of these blocks are
entered in the same line. Thus, the final column of Table 55
records the block totals. The data in this form are therefore
representative of the yields from a randomized block layout
made up of six blocks and three varieties.

The dispersion of these 18 variates, the total sum of
squares, is composite in character and is the result of

a. Differences between varieties.

b. Differences in soil fertility between blocks.

c. Unavoidable variation between similar variates, i.e., error
variance.

These component sums of squares can be assessed in the usual
way, and the appropriate number of degrees of freedom allocated
to each. The total and variety sums of squares have
been worked out in Example 26.

\[ \text{Block S.S.} = \frac{33^2 + 42^2 + \cdots}{3} - \frac{216^2}{18} = 72 \]

Table 57. Analysis of Variance of Randomized Block Layout

Factor        S.S.     Degrees of     Variance       F
                       freedom

Variety        93          2           46.50       24.47
Blocks         72          5           14.40
Error          19         10            1.90
Total         184         17

Comparing this analysis with that given in Table 56, the most
obvious difference is the very much smaller error variance here,
showing that the variation in fertility between blocks is responsible
for a relatively large share of the total dispersion. In fact,
if the F test is used to compare the block variance with that for
error, the difference between them will be found to be definitely
significant. Even though the degrees of freedom of error have
been reduced from 15 to 10, the elimination of the block component
of the dispersion has greatly increased the chances of a
positive F test. The error variance in Table 57 really measures
the unavoidable variation between plots in the same block,
i.e., the aggregate within-block variation, after due allowance has
been made for varietal differences. It should now be clear why
it is important to have the individual blocks as compact as
possible. The elimination of the block sum of squares from the
estimate of error will not usually be advantageous unless there is a
reasonable similarity in environmental conditions between the
plots in any one block, and this is likely to occur only where the ...
result of the inclusion of a large number of treatment comparisons
in a single experiment, tend to annul the benefit that might be ...

On this basis, the treatments can be correctly graded in the
following order: C, A, B. The block layout has reduced the
estimate of error sufficiently to prove that the difference between
the A and B varieties, which was nonsignificant on the original
analysis, is actually a real difference in favor of A.

In most agricultural experiments, it is advisable to express the 
final statement of results in the units of measurement normally 
adopted commercially. This is most easily effected by calculat- 
ing a single conversion factor by which the treatment totals and 
standard errors will have to be multiplied. If one assumes that a 
bushel of wheat weighs 62 pounds, the factor required to convert 
the wheat variety totals into bushels per acre will be 


\[ \frac{\cdots}{6 \times 62} = 0.242 \]

The results can now be recorded as 

Variety A 16.7 ± 0.819 bu. per acre 

Variety B 13.8 ± 0.819 bu. per acre

Variety C 21.8 ± 0.819 bu. per acre 

and a clear statement of the final conclusions should follow. 

As a significant difference between means is approximately one
greater than three times the standard error, it is possible for 
the reader to apply a simple test of the accuracy of the deductions. 
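
The conversion and the reader's rough test can be sketched as follows. The factor 0.242 and the standard error 0.819 are those quoted above; the variety totals of 69, 57, and 90 kg are the values consistent with the quoted bushels per acre and with the grand total of 216. The names are illustrative.

```python
CONVERSION_FACTOR = 0.242   # variety total (kg) -> bushels per acre, as quoted
STANDARD_ERROR = 0.819      # bushels per acre, as quoted
variety_totals = {"A": 69, "B": 57, "C": 90}

bushels_per_acre = {
    variety: round(total * CONVERSION_FACTOR, 1)
    for variety, total in variety_totals.items()
}

# Rough test: a difference greater than three times the standard error.
threshold = 3 * STANDARD_ERROR
for first, second in (("C", "A"), ("A", "B")):
    difference = bushels_per_acre[first] - bushels_per_acre[second]
    verdict = "significant" if difference > threshold else "not significant"
    print(f"{first} - {second} = {difference:.1f} bu. per acre ({verdict})")
```

Both differences exceed three times the standard error, in agreement with the grading C, A, B reached from the randomized block analysis.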

Using the variable-squared method of computation, the following
formulas summarize the calculations involved in the analysis
of variance of data obtained from a randomized block experiment.

Let x   = yield of any plot.
    n   = number of blocks, i.e., number of replications of
          each treatment.
    p   = number of treatment comparisons, i.e., number of
          plots in any one block.
    T   = grand total of all plot yields.
    T_t = total yield of the n plots of any treatment.
    T_b = total yield of the p plots in any block.

Then,

Total S.S. = $\sum x^2 - \frac{T^2}{np}$ with $np - 1$ degrees of freedom

Treatment S.S. = $\frac{\sum T_t^2}{n} - \frac{T^2}{np}$ with $p - 1$ degrees of freedom

Block S.S. = $\frac{\sum T_b^2}{p} - \frac{T^2}{np}$ with $n - 1$ degrees of freedom

Error S.S. = total S.S. - (treatment S.S. + block S.S.) with
$(n - 1)(p - 1)$ degrees of freedom

If the assumed-mean method of computation is adopted, the
same formulas apply if x, T, T_b, T_t represent the values recorded
in the table of deviations instead of the actual plot yields and
totals.
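The formulas above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `randomized_block_anova` and the small table of yields are hypothetical and are not data from the text.

```python
def randomized_block_anova(yields):
    """Analysis of variance for a randomized block layout.

    yields[b][t] is the yield of treatment t in block b
    (n blocks, each holding one plot of each of p treatments).
    Returns {component: (sum of squares, degrees of freedom)}.
    """
    n = len(yields)          # number of blocks (replications)
    p = len(yields[0])       # number of treatments
    grand_total = sum(sum(block) for block in yields)
    correction = grand_total ** 2 / (n * p)            # T^2 / np

    total_ss = sum(x * x for block in yields for x in block) - correction
    block_totals = [sum(block) for block in yields]
    treatment_totals = [sum(block[t] for block in yields) for t in range(p)]
    treatment_ss = sum(t * t for t in treatment_totals) / n - correction
    block_ss = sum(b * b for b in block_totals) / p - correction
    error_ss = total_ss - (treatment_ss + block_ss)

    return {
        "total": (total_ss, n * p - 1),
        "treatments": (treatment_ss, p - 1),
        "blocks": (block_ss, n - 1),
        "error": (error_ss, (n - 1) * (p - 1)),
    }


# Hypothetical yields: 3 blocks x 2 treatments (not data from the text).
example = [[10, 12], [11, 15], [9, 12]]
table = randomized_block_anova(example)
```

Dividing each sum of squares by its degrees of freedom gives the variances that are compared in the F or z test.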

LATIN SQUARE 

In this layout, the number of replications is made equal to
the number of different treatments included in the experiment.
If n is the number of replications (or treatments), then the total
number of plots in the square will be $n^2$. The plots are arranged
in a single large block so as to give the same number in line
counting across or along the field, i.e., the number of rows of
plots is made the same as the number of columns. The dimensions
of the individual plots may be anything from square to
relatively long narrow strips, and the shape of the Latin square
will be square or rectangular accordingly. The term "square"
applied to this type of experiment is therefore used in the
conventional sense of the word.

In the distribution of the treatments over the plots of the
experiment, each treatment should appear once in each row and
once in each column, but the allocation of the treatments within
the rows and columns is otherwise at random. In order to
ensure the validity of the analysis of variance of the data, it is
necessary to effect a correct randomization of the treatments
among the rows and columns, so that the ultimate plot arrangement
is a random sample of all possible squares of that size.
Especially with large …

A B C D      B C D A      B A C D      B C A D
B C D A      C D A B      C B D A      A B D C
C D A B      A B C D      A D B C      C D B A
D A B C      D A B C      D C A B      D A C B

Standard     Reshuffling  Reshuffling  Reshuffling of the
square       of rows in   of columns   A, B, C, D treatments
             the order    in the order in the random order
             2, 3, 1, 4   1, 4, 2, 3   C, B, A, D
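The three reshuffling steps shown in the diagram can be sketched as follows. The function names are illustrative assumptions, not part of the text. Note also that shuffling rows, columns, and letters of one standard square is the procedure illustrated here; it is not claimed to sample every possible Latin square with equal probability.

```python
import random


def standard_square(letters):
    """Cyclic standard square: each row is the previous one shifted by one."""
    n = len(letters)
    return [[letters[(r + c) % n] for c in range(n)] for r in range(n)]


def reorder_rows(square, order):
    """Rearrange rows; `order` lists 1-based row numbers, e.g. [2, 3, 1, 4]."""
    return [square[i - 1] for i in order]


def reorder_columns(square, order):
    """Rearrange columns; `order` lists 1-based column numbers."""
    return [[row[j - 1] for j in order] for row in square]


def relabel(square, old, new):
    """Replace each treatment letter in `old` by the matching letter in `new`."""
    mapping = dict(zip(old, new))
    return [[mapping[x] for x in row] for row in square]


def random_latin_square(letters, rng=random):
    """Apply the three reshuffles with randomly drawn orders."""
    n = len(letters)
    square = standard_square(letters)
    square = reorder_rows(square, rng.sample(range(1, n + 1), n))
    square = reorder_columns(square, rng.sample(range(1, n + 1), n))
    return relabel(square, letters, rng.sample(list(letters), n))


# Reproduce the sequence in the diagram with its stated orders.
step1 = standard_square("ABCD")
step2 = reorder_rows(step1, [2, 3, 1, 4])
step3 = reorder_columns(step2, [1, 4, 2, 3])
step4 = relabel(step3, "ABCD", "CBAD")
final_rows = ["".join(row) for row in step4]
```

With the orders quoted in the diagram, `final_rows` reproduces the fourth square row by row.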

The advantage of the Latin-square layout lies in the fact that
the plots can be grouped into n similar blocks in two distinct
ways: (a) in accordance with the rows and (b) in accordance
with the columns. Each of these components can be evaluated
in the analysis of variance and, in consequence, it is possible to
eliminate from the estimate of error the effects of soil fertility
changes in two directions at right angles. As a control of the
soil heterogeneity factor, the Latin square will generally be
found to be an improvement on the randomized block layout.
It has the disadvantage that it is much less flexible in character.
For example, the number of replications is limited by the number
of treatment comparisons, and where only three or four treatments
are to be compared, the total number of plots in a single
square will be only 9 or 16, respectively, which will not provide
the minimum number of degrees of freedom advisable in order
to ensure a fair estimate of the error variance.

The analysis of variance of a Latin-square experiment is
fundamentally the same as that of the randomized block layout.
The only modification is that the block sum of squares is virtually
duplicated in the rows and in the columns, and the sum of squares
attributed to each of these components separately is subtracted
from the total sum of squares in estimating the error.

Example. Statistical Analysis of a 5 X 5 Latin Square for
Data Taken from an Experiment with Sugar Cane.

The five treatments were as follows:

A  No manure
B  Complete inorganic at rate of 90 lb. N, 375 lb. P2O5, and 60 lb.
   K2O per acre
C  10 tons farmyard manure per acre
D  20 tons farmyard manure per acre
E  30 tons farmyard manure per acre

Table 58. Yield of Plant Cane in Half-hundredweights per
Plot around an Assumed Mean of 40 Half-cwt.

[Plot layout, individual plot deviations, and column totals not recoverable.]

Row     Row total        Treatment    Treatment total
I         -17                A             -34
II        + 3                B             +23
III       +10                C             -11
IV        + 2                D             - 4
V         + 7                E             +31

Grand total  + 5

The treatment, row, and column totals are all tabulated,
and the sum of squares belonging to each of these components can
be calculated as usual. For example,

Row S.S. = $\frac{17^2 + 3^2 + 10^2 + 2^2 + 7^2}{5} - \frac{5^2}{25} = 89.2$

There are five separate rows, so that the number of degrees of
freedom of the row S.S. is 4; the same applies for columns and
treatments.
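As a check on the arithmetic, the row and treatment sums of squares can be recomputed from the marginal totals alone. The sketch below assumes the row totals (-17, +3, +10, +2, +7) and treatment totals (-34, +23, -11, -4, +31) as read from Table 58; the helper name `ss_from_totals` is hypothetical.

```python
def ss_from_totals(totals, plots_per_total, grand_total, n_plots):
    """Sum of squares for one component, computed from its totals.

    Each total is squared, the squares are summed and divided by the
    number of plots in a total, and the correction term T^2/N is subtracted.
    """
    correction = grand_total ** 2 / n_plots
    return sum(t * t for t in totals) / plots_per_total - correction


# Deviations from the assumed mean of 40 half-cwt. (totals of five plots each).
row_totals = [-17, 3, 10, 2, 7]
treatment_totals = [-34, 23, -11, -4, 31]
grand_total = sum(row_totals)          # +5

row_ss = ss_from_totals(row_totals, 5, grand_total, 25)
treatment_ss = ss_from_totals(treatment_totals, 5, grand_total, 25)
```

Under these assumptions the row sum of squares comes to 89.2 and the treatment sum of squares to 555.6, the latter agreeing with the value quoted later for the total treatment S.S.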

Table 59. Analysis of Variance

Factor        S.S.     Degrees of freedom    Variance      F
Total         800.0          24
Rows           89.2           4                22.3
Columns        10.0           4                 2.5
Treatments    555.6           4               138.9      11.48
Error         145.2          12                12.1

The treatment variance is obviously significant. The standard
error of any treatment total is $\sqrt{12.1 \times 5} = 7.778$. To
convert this standard error or the corresponding treatment
totals into tons per acre, it is necessary to multiply by the
factor $\frac{3}{20}$. The standard error expressed in tons per
acre = $7.778 \times \frac{3}{20} = 1.167$. The treatment totals given in
Table 58 represent the total deviation of five plots round an
assumed mean of 40 half-hundredweights, so that, before
multiplying by the conversion factor for tons per acre, these
totals must be expressed in absolute units of half-hundredweights
by adding to each 5 X 40 or 200 half-hundredweights.
The mean yield of treatment A is therefore

$(-34 + 200) \times \frac{3}{20} \pm 7.778 \times \frac{3}{20} = 24.90 \pm 1.167$;

the others are treated similarly.
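The conversion described above can be laid out as a short calculation. This is a sketch only: the factor 3/20 is taken as the value implied by the quoted figures (7.778 becoming 1.167), and the variable names are illustrative.

```python
import math

ASSUMED_MEAN = 40        # half-cwt. per plot
PLOTS_PER_TOTAL = 5      # each treatment total covers five plots
FACTOR = 3 / 20          # half-cwt. per five plots -> tons per acre (assumed)
ERROR_VARIANCE = 12.1    # error variance per plot

# Treatment totals as deviations from the assumed mean.
deviation_totals = {"A": -34, "B": 23, "C": -11, "D": -4, "E": 31}

# Standard error of a treatment total, then in tons per acre.
se_total = math.sqrt(ERROR_VARIANCE * PLOTS_PER_TOTAL)
se_tons = se_total * FACTOR

# Restore absolute totals (add 5 x 40 = 200) before converting.
tons_per_acre = {
    name: (dev + PLOTS_PER_TOTAL * ASSUMED_MEAN) * FACTOR
    for name, dev in deviation_totals.items()
}
```

Adding back the 200 half-hundredweights before multiplying matters only for the totals; the standard error is unaffected by the assumed mean and is converted directly.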


Treatment                            Tons per acre

A  No manure                         24.90 ± 1.167
B  Complete inorganic                33.45 ± 1.167
C  10 tons farmyard manure           28.35 ± 1.167
D  20 tons farmyard manure           29.40 ± 1.167
E  30 tons farmyard manure           34.65 ± 1.167

Using three times the standard error,* i.e., 3.5 tons per acre,
as the measure of a significant difference, it is obvious that the
most effective treatments are the complete inorganic mixture
and the heaviest application of farmyard manure. The 10-ton
dressing of farmyard manure just fails to show a significant
increase over the control, which is, however, definitely worse
than the remaining three treatments. The five treatments can
therefore be graded as follows:

Poor      No manure
Average   10 and 20 ton dressings of farmyard manure
Good      Complete inorganic and 30 tons farmyard manure

* The exact critical difference is $\sqrt{12.1 \times 5 \times 2} \times 2.179 \times \frac{3}{20} = 3.59$.

No manure
1½ lb. superphosphate per tree
3 lb. superphosphate per tree
The calculations involved in an analysis of variance of a
Latin-square experiment are summarized in the following formulas:

Let x   = yield of any plot.
    n   = number of replications of each treatment or number
          of rows or number of columns.
    T   = grand total of all $n^2$ plots.
    T_t, T_r, T_c = total yield of n plots of any treatment, row, and
          column, respectively.

Total S.S. = $\sum x^2 - \frac{T^2}{n^2}$ with $n^2 - 1$ degrees of freedom

Treatment S.S. = $\frac{\sum T_t^2}{n} - \frac{T^2}{n^2}$ with $n - 1$ degrees of freedom

Row S.S. = $\frac{\sum T_r^2}{n} - \frac{T^2}{n^2}$ with $n - 1$ degrees of freedom

Column S.S. = $\frac{\sum T_c^2}{n} - \frac{T^2}{n^2}$ with $n - 1$ degrees of freedom

Error S.S. = total S.S. - (treatment S.S. + row S.S. + col. S.S.)
with $(n - 1)(n - 2)$ degrees of freedom
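The Latin-square formulas translate directly into code. The sketch below is an illustration under stated assumptions: the function name `latin_square_anova`, the 3 X 3 layout, and the yields are all hypothetical and are not taken from the text.

```python
def latin_square_anova(layout, yields):
    """Analysis of variance for an n x n Latin square.

    layout[r][c] is the treatment letter on the plot in row r, column c;
    yields[r][c] is the yield of that plot.
    Returns {component: (sum of squares, degrees of freedom)}.
    """
    n = len(yields)
    grand_total = sum(sum(row) for row in yields)
    correction = grand_total ** 2 / n ** 2            # T^2 / n^2

    def component_ss(totals):
        return sum(t * t for t in totals) / n - correction

    row_totals = [sum(row) for row in yields]
    column_totals = [sum(row[c] for row in yields) for c in range(n)]
    treatment_totals = {}
    for r in range(n):
        for c in range(n):
            key = layout[r][c]
            treatment_totals[key] = treatment_totals.get(key, 0) + yields[r][c]

    total_ss = sum(x * x for row in yields for x in row) - correction
    treatment_ss = component_ss(treatment_totals.values())
    row_ss = component_ss(row_totals)
    column_ss = component_ss(column_totals)
    error_ss = total_ss - (treatment_ss + row_ss + column_ss)

    return {
        "total": (total_ss, n * n - 1),
        "treatments": (treatment_ss, n - 1),
        "rows": (row_ss, n - 1),
        "columns": (column_ss, n - 1),
        "error": (error_ss, (n - 1) * (n - 2)),
    }


# Hypothetical 3 x 3 square (not data from the text).
layout = ["ABC", "BCA", "CAB"]
yields = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]
square_table = latin_square_anova(layout, yields)
```

The error degrees of freedom for a 3 X 3 square come to only 2, which illustrates why small single squares give a poor estimate of the error variance.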

Replication of Latin Squares. In the Latin square, the number
of replications is limited to the number of treatments tested,
so that in small experiments the number of degrees of freedom
on which the error variance is based becomes unduly depleted.
This difficulty can be overcome if two or more Latin squares
are used instead of a single one. A separate randomization
of the treatments must be effected for each square. The
statistical analysis becomes slightly more elaborate in order to
take into account the fact that there may be considerable
variation in the average soil fertility of the different squares.
This is especially true if the squares are located some distance
apart. In fact the treatments must be regarded as complex
in nature consisting of actual treatment in combination with
site, i.e., of treatment and Latin square, and the interaction of
these two should be evaluated.
Example 29. Statistical Analysis of a Cacao Manurial
Experiment Consisting of Three 3 X 3 Latin Squares. The
fertilizers used were as follows:

Table 60. Layout and Yield in Pods per Tree from … Plots

[Plot layout, individual plot yields, and column totals not recoverable.]

                 Square I    Square II    Square III
Row 1 total         81          58           43
Row 2 total         76          30           71
Row 3 total         55          43           57
Square total       212         131          171

The calculation of the total sum of squares from the 27 plot
yields is perfectly straightforward, and amounts to 2,117.0.
From the table of the treatment totals,

Manurial treatment S.S. = $\frac{98^2 + 214^2 + 202^2}{9} - \frac{514^2}{27} = 904.4$

Interaction: manures X squares

In estimating this interaction sum of squares, the totals for
each treatment in each square …

                         Treatment totals

Treatment        Square I    Square II    Square III    Treatment totals
A                   47          11           40               98
B                   94          59           61              214
C                   71          61           70              202
Square total       212         131          171         Grand total = 514






Row S.S. = $\left(\frac{81^2 + 76^2 + 55^2}{3} - \frac{212^2}{9}\right) + \left(\frac{58^2 + 30^2 + 43^2}{3} - \frac{131^2}{9}\right) + \left(\frac{43^2 + 71^2 + 57^2}{3} - \frac{171^2}{9}\right) = 388.5$

As there are three rows in each square, there will be 2 + 2 + 2
degrees of freedom attached to this estimate. In a similar
manner the column sum of squares can be calculated. It
amounts to 242.4.
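The pooling of the row sums of squares over the three squares can be sketched as below. The row totals are those appearing in the row S.S. expression; the function name `pooled_within_square_ss` is an assumption made for illustration.

```python
def pooled_within_square_ss(totals_by_square, plots_per_total):
    """Sum of squares for rows (or columns) computed separately in each
    square and then added together, with the degrees of freedom pooled."""
    ss = 0.0
    df = 0
    for totals in totals_by_square.values():
        square_total = sum(totals)
        plots_in_square = plots_per_total * len(totals)
        ss += (sum(t * t for t in totals) / plots_per_total
               - square_total ** 2 / plots_in_square)
        df += len(totals) - 1
    return ss, df


row_totals = {
    "I": [81, 76, 55],
    "II": [58, 30, 43],
    "III": [43, 71, 57],
}
row_ss, row_df = pooled_within_square_ss(row_totals, plots_per_total=3)
```

Computed without intermediate rounding, the pooled value is about 388.44, which agrees with the quoted 388.5 to within rounding of the separate terms. Because each square uses its own correction term, differences in average fertility between squares do not enter the row sum of squares.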


Table 61. Analysis of Variance

Factor                   S.S.     Degrees of    Variance    1/2 log_e
                                   freedom                  (variance/10)
Total                   2117.0        26
Manurial treatments      904.4         2          452.2       1.9057
Squares                  364.6         2          182.3       1.4515
Squares X treatments     156.0         4           39.0
Rows                     388.5         6           64.7       0.9336
Columns                  242.4         6           40.4
Error                     61.1         6           10.2       0.0099

Rows vs. error: z = 0.9336 - 0.0099 = 0.9237 (significant)


The column and interaction variances are not significant on a
5 per cent probability basis. The z test shows the rest of the factors
in the analysis to be significantly greater than the error variance.
The elimination of the soil heterogeneity effects in the evaluation
of the variance of squares, rows, and columns has been beneficial.
… half of Table 60, are used, nine separate values in all. These
values account for 8 degrees of freedom, of which 4 have already
been used up in the square and manurial treatment sums of
squares. This leaves 4 degrees of freedom for the interaction of
squares X treatments.

As allowance has already been made for differences in fertility
between the three squares, the row sum of squares must be
calculated for each square separately, and the three separate values
and degrees of freedom must be added together.

… proves conclusively that there is a marked increase in yield as
a result of the application of a phosphatic manure to the cacao
trees, and that the double dressing of 3 pounds per tree shows no
advantage over the single one.

GENERALIZATION OF RESULTS

The conclusion, as derived from the statistical
calculations, that 1½ pounds of phosphate is the optimum rate
for fertilizing cacao, only validly applies to the particular three
soils or localities in which the squares were laid out. It is possible
to use the same analysis of variance to achieve a more
comprehensive conclusion which, on the average, can be said to be true
for all cacao estates situated in the same general crop zone, and
not only for sites showing similar soil and environmental factors
to those actually occurring in the experiment. Let us suppose
that three typical cacao estates at widely different centers had
been selected at random for this experiment and that one square
had been established on each estate. It follows that the soils of
the three squares represent a random sample of the different
soils in which cacao is likely to be planted in the locality. A fair
estimate of the error variance likely to occur on this wide variety
of soils will then be given by the treatment X square interaction,
and results which are significant, as shown by the comparison
between the treatment variance and that for this interaction of
squares X treatments, are valid for the whole cacao crop of the
locality.

The required variances taken from Table 61 are as follows:

Factor                               S.S.    Degrees of   Variance   1/2 log_e
                                              freedom                (variance/10)
Manurial treatment                   904.4       2          452.2      1.9057
Interaction: squares X treatments    156.0       4           39.0      0.6805
Difference                                                             1.2252

The z test shows that the difference in the logarithmic column
is significant. A difference between the treatment totals of nine
plots greater than $\sqrt{39 \times 9 \times 2} \times 2.776 = 73.6$ is significant.
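The z value and the critical difference quoted above can be checked with a few lines of Python. This is a sketch only; the variable names are illustrative, and 2.776 is the tabulated 5 per cent value of t for 4 degrees of freedom, the number attached to the interaction variance.

```python
import math

treatment_variance = 452.2     # 2 degrees of freedom
interaction_variance = 39.0    # 4 degrees of freedom, used here as error
plots_per_total = 9            # each treatment total covers nine plots
t_5_percent = 2.776            # t for 4 degrees of freedom

# z is half the natural logarithm of the variance ratio, which equals the
# difference between the two entries in the logarithmic column.
half_log_treatment = 0.5 * math.log(treatment_variance / 10)
half_log_interaction = 0.5 * math.log(interaction_variance / 10)
z = half_log_treatment - half_log_interaction

# Standard error of the difference between two treatment totals,
# multiplied by t, gives the smallest significant difference.
critical_difference = (
    math.sqrt(interaction_variance * plots_per_total * 2) * t_5_percent
)
```

Dividing both variances by 10 before taking logarithms leaves z unchanged, since the divisor cancels in the difference.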





This again proves that the 1½-pound dressing is the best rate to
use but, in addition, it has changed the practical significance of
this result from one of rather limited application to one of general
validity.

A similar type of statistical treatment will often be of value
in the interpretation of the complete data from several years'
experiments with the same material. In such serial experiments,
it is exceedingly important to know whether the conclusions are
likely to hold good for all ordinary seasonal vagaries, or whether,
strictly speaking, they only really apply to the particular types
of season prevailing in the experimental years. A comparison
of the treatment variance with that for the treatment X season
interaction is the most effective method at present available for
settling the former query.

GROUPING OF TREATMENT COMPARISONS

In experiments in which one of the treatments is in the form
of a control or standard type, it is often instructive to determine
whether, on the average, the other treatments can be regarded
as better or worse than the control. If the treatment variance is
considered as a whole, it may even happen that the individual
treatment differences are not sufficiently great to give a positive z
test, whereas the isolation of one particular treatment comparison
in the form of control vs. the average effect of the rest may
indicate some definite response in the yields. The modus
operandi is perfectly simple, being essentially the resolution of
the total treatment variance among the various factors
determining its value. Consider Example … The
treatment yields for the total of five plots are as follows:

Treatment                          Half-hundredweights
No manure                                 166
Complete inorganic fertilizer             223
10 tons farmyard manure                   189
20 tons farmyard manure                   196
30 tons farmyard manure                   231
… plots plus the variation due to the different types of fertilizer
used. The totals for the nonmanured and manured treatments
are for 5 and for 20 plots, respectively, so that the sum of squares
for the first component,

S.S. manure vs. no manure = $\frac{166^2}{5} + \frac{839^2}{20} - \frac{1005^2}{25} = 306.25$

The S.S. type of manure = $\frac{223^2 + 189^2 + 196^2 + 231^2}{5} - \frac{839^2}{20} = 249.35$

Total treatment S.S. = 306.25 + 249.35
                     = 555.6 (as originally calculated, see Table 59)

The fertilizers in turn can be divided into two distinct classes,
organics and inorganics, and it is informative to carry the
analysis a step further and resolve the sum of squares for manures
into its components.

S.S. organics vs. inorganics = $\frac{223^2}{5} + \frac{616^2}{15} - \frac{839^2}{20} = 46.82$

S.S. quantity of farmyard manure = $\frac{189^2 + 196^2 + 231^2}{5} - \frac{616^2}{15} = 202.53$

Table 62. Analysis of Variance

Factor                          S.S.     Degrees of freedom    Variance
Manure vs. no manure           306.25           1               306.25**
Organics vs. inorganics         46.82           1                46.82
Quantity of farmyard manure    202.53           2               101.27**
Error (from Table 59)          145.2           12                12.1

** Significant at the 1 per cent point. It should be noted that, in writing up experimental
results, it is not generally considered necessary to tabulate the calculated values of F or z.
The customary practice is to mark variances which are significant at the 5 per cent point
with one star, and those significant at the 1 per cent point with two stars.

After allocating to each its correct share of the total treatment
degrees of freedom, the F or z test can be used to test the
significance of any of these components of the treatment variance.

The first and third components are significant, but there is no
significance in the comparison of organics vs. inorganics. Manuring
of cane, as judged by the average result of the four dressings
tested, leads to a definite increase in yield; a heavy application
of farmyard manure produces a much bigger response than a
light one; a complete inorganic fertilizer produced a yield
increment equivalent to that of a heavy dressing of farmyard manure.
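The resolution of the treatment sum of squares into its components can be sketched as follows. The absolute treatment totals used here (166, 223, 189, 196, 231 half-hundredweights) are assumed to be the Table 58 deviations with 200 added to each; the helper name `ss_between` is hypothetical.

```python
def ss_between(groups):
    """Sum of squares between groups given (total, number of plots) pairs:
    sum of T_i^2 / n_i minus (sum of T_i)^2 / (sum of n_i)."""
    grand_total = sum(total for total, _ in groups)
    plots = sum(n for _, n in groups)
    return sum(total ** 2 / n for total, n in groups) - grand_total ** 2 / plots


# Totals of five plots each, in half-hundredweights (assumed, see above).
none, inorganic, fym10, fym20, fym30 = 166, 223, 189, 196, 231
manured = inorganic + fym10 + fym20 + fym30        # 839, from 20 plots
farmyard = fym10 + fym20 + fym30                   # 616, from 15 plots

manure_vs_none = ss_between([(none, 5), (manured, 20)])
type_of_manure = ss_between([(inorganic, 5), (fym10, 5), (fym20, 5), (fym30, 5)])
organic_vs_inorganic = ss_between([(inorganic, 5), (farmyard, 15)])
quantity_of_farmyard = ss_between([(fym10, 5), (fym20, 5), (fym30, 5)])
```

The components are additive at each stage: the first two sum to the total treatment S.S. of 555.6, and the last two sum to the S.S. for type of manure.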

ORTHOGONALITY 

The analysis of variance technique as applied to research
data is valid only when the experimental design is orthogonal.
Yates defines orthogonality as "that property of the design which
ensures that the different classes of effects to which the
experimental material is subject shall be capable of direct and separate
estimation without any entanglement." A fundamental principle
of the orthogonal layout is that any real differences between
the treatments in one series should not affect the relative values
for any of the other series in the experiment. To ensure this, it is
advisable to use the same number of replicates for each treatment
in any one series. This should effect a balanced and
orthogonal experimental design. Apart from any question of
orthogonality, equal numbers of replicates are desirable in
order to keep the statistical calculations as simple as possible.
In a simple randomized block experiment with five blocks and
two treatments, each treatment occurs once in each block; any
fertility difference between the blocks as a whole will be reflected
proportionately in all the treatments, so that treatment differences
are not affected by variation between the blocks. Similarly,
each block contains one plot of every treatment so that
any difference between the blocks is not affected by treatment
differences. Treatment and block variances are entirely
independent and can be calculated separately. The layout is
orthogonal, and the analysis of variance technique can be validly
applied to the data.

If, by accident, two plots of one treatment and none of the
second were included in one of the blocks, the whole balance
between treatments and blocks is upset, as shown in the following
diagram:

Treatment              Blocks                No. of variates in
              I     II    III    IV     V     each treatment
A             x     x     xx     x      x            6
B             x     x            x      x            4

Suppose, for example, that the soil fertility in block III is
distinctly above average; the mean yield of treatment A will be
above and that of treatment B will be below their true potential
values. The resultant conclusion regarding the relative merits
of the two treatments might be erroneous. The error variance
from a randomized block design is really a measure of the
interaction between blocks and treatments. Normally, in a layout
of the above type, the interaction might be assessed directly from
the difference in the A - B values for each block. Here, however,
treatment differences are entangled or confounded with
the block differences, and the interaction or error variance might
also be affected by the mistake in block III. The results would
probably be markedly falsified. The layout is therefore no
longer orthogonal, and the ordinary analysis of variance technique
cannot be validly applied. A modified and very much more
complicated method of statistical analysis, entailing the fitting
of constants, would be required. Alternatively, the between- and
within-treatment variance could be calculated and used to compare
the treatments, taking into account the different number of
variates in the two series. Although this method is permissible,
it makes no allowance for the variation in fertility between
blocks and would almost certainly give an unnecessarily high
estimate of error and detract from the precision of the experiment.
The same criticism of nonorthogonality would be true if the
above mistake occurred in a factorial experiment in which the
numbers I to V represented a second treatment series
superimposed on the A and B treatments and if several replications of
each treatment type were included. The data could still
apparently be analyzed by the ordinary analysis of variance procedure,
but the nonorthogonal layout might lead to a false estimate not
only of the treatment variances but also of the error variance.
The example cited represents a relatively slight deviation from
orthogonality, and the results by the ordinary method of analysis
might not be greatly different from the correct values for the
experiment. This could not be considered as any excuse for
knowingly applying a faulty statistical technique. The more
extreme the degree of nonorthogonality, the greater will the
falsification of the results by a simple analysis of variance tend to
be. Lack of recognition in the past of the principle of orthogonality
has led to the incorrect use of the analysis of variance technique
and subsequent erroneous interpretation of experimental
data. In planning new experiments, the novice would be well
advised to adopt only standardized designs in which a correct
balance between the different components of the analysis of
variance is maintained.

INCOMPLETE RECORDS

In field experiments, it is sometimes impossible, for one
reason or another, to obtain the correct yield data for certain
plots. This may be the result of loss of the actual yield figures,
mistakes at harvest, serious damage to isolated plots by vermin
or flooding, or other external agencies. The experimental
precision is bound to be somewhat impaired, as, of course, the
original orthogonal design is upset even when only a single variate
is missing. If an accurate appreciation of the relative treatment
effects is to be obtained, the statistical technique has to be
modified.

Consider first of all the simplest case of a randomized block
layout in which only a single plot yield is missing. A valid but
rather inefficient method of tackling this case would be to ignore
all treatment yields in the block in which the missing plot is
located, and to carry out a normal analysis of variance
of the data from the remaining n - 1 blocks. This method is
simple but has the obvious disadvantages of reducing the number
of replications of each treatment, of reducing the
number of degrees of freedom of error, and of markedly decreasing
the precision of the experiment.

A second but inaccurate method which has been used in the
past is to make allowance in computing the component sums of
squares for the fact that one of the treatments and one of the
blocks has one replicate less than the remainder. This makes
the arithmetical calculations slightly more complicated than
would have been the case had the observations been complete.
It has the much bigger disadvantage that, owing to the
nonorthogonal nature of the data, it is statistically unsound and
may lead to false conclusions.

The best method of tackling the problem is to apply Yates'
"missing plot technique" in which the remainder of the data is
used to provide a logical estimate of the missing yield. In
this way, the required degree of orthogonality is recovered, and
a simple analysis of variance of the completed data can then be
made, provided an appropriate modification in the number of
degrees of freedom allocated to error and in the estimation of the
standard errors of the various treatments is incorporated. It
is proposed to use the data from Table 55 with one variate
omitted to illustrate the practical application of these last two
alternatives and demonstrate that even this slight deviation
from normality must be given due weight in the statistical
interpretation of the results.

Example 30. Analysis of Data from Randomized Block
Experiment in Which One of the Plot Yields from Block V Is
Lost. The data are otherwise the same as recorded in Table 55.

Table 63. Yield of Wheat in Kilograms per Plot

[Plot yields, block totals, variety totals, and variety means not recoverable. Grand total = 198.]

In a randomized block experiment an estimate of the potential
yield of the missing plot can be obtained from the formula

$x = \frac{n T_v + p T_s - T}{(n - 1)(p - 1)}$

where x   = missing-plot yield.
      n   = number of blocks or replicates.
      p   = number of treatments.
      v   = block in which missing plot is located.
      T_v = total of remaining p - 1 plots in block v.
      s   = treatment in which missing plot occurs.
      T_s = total of remaining n - 1 plots in treatment s.
      T   = grand total of all available np - 1 plots.

Applying this equation to provide an estimate of the missing
yield in Table 63 for variety C in block V,

$x = \frac{6 \times 27 + 3 \times 72 - 198}{(6 - 1)(3 - 1)} = 18$
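The missing-plot formula for a randomized block layout is easily expressed as a function. The name `missing_plot_randomized_block` and its argument names are illustrative assumptions; the figures passed in are those quoted for variety C in block V.

```python
def missing_plot_randomized_block(n, p, block_total, treatment_total, grand_total):
    """Estimate of a single missing plot yield in a randomized block layout.

    n               number of blocks (replicates)
    p               number of treatments
    block_total     total of the remaining p - 1 plots in the affected block
    treatment_total total of the remaining n - 1 plots of the affected treatment
    grand_total     total of all np - 1 available plots
    """
    return (n * block_total + p * treatment_total - grand_total) / (
        (n - 1) * (p - 1)
    )


# Variety C in block V: six blocks, three varieties.
estimate = missing_plot_randomized_block(
    n=6, p=3, block_total=27, treatment_total=72, grand_total=198
)
```

The estimate is entered in the table in place of the lost yield before the sums of squares are computed, with one degree of freedom deducted from error.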

This value happens to have worked out at exactly the yield
figure recorded for this plot in the original data of Table 55.
The component sums of squares including the estimate of the
missing plot will be the same as those worked out in Table 57
for the original analysis of variance. When a single plot yield
is missing, 1 degree of freedom is used up in the estimation of
the missing variate, which reduces the total available degrees
of freedom to np - 2. This in turn cuts down by unity the
number of degrees of freedom normally attributed to error in an
experiment of this particular layout. The appropriate analysis
of variance of the data on this basis is appended.


Table 64. Analysis of Variance Derived from "Missing-plot" Technique

Factor       S.S. (from Table 57)    Degrees of freedom    Variance
Total               184                     16
Blocks               72                      5
Varieties            93                      2              46.50**
Error                19                      9               2.11

** Significant at the 1 per cent point. It should be noted that, in writing up experimental
results, it is not generally considered necessary to tabulate the calculated values of F or z.
The customary practice is to mark variances which are significant at the 5 per cent point
with one star, and those significant at the 1 per cent point with two stars.

The variety variance is significantly greater than the error
variance. Actually the use of the missing-plot technique tends
to give a treatment variance slightly in excess of its true value,
but the exaggeration is negligible.

For the comparison of treatments, other than treatment s
with the missing plot, the usual formulas for calculating the
standard errors apply.

$E_d = \sqrt{\frac{2 \times \text{error variance}}{n}}$

Thus, in comparing the means of treatments A and B,

$E_d = \sqrt{\frac{2 \times 2.11}{6}} = 0.84$

A significant difference between the means of treatments A and
B is one greater than 0.84 X 2.262 or 1.90. Variety A is
therefore better than variety B.

In order to compare treatment s with any other treatment, a
revised value of $E_d$ has to be calculated. An approximate but
satisfactory estimation of this value will be obtained provided:

a. The number of replications of treatment s is assumed to
be n - 1, even though the estimate of the missing-plot yield has
been used in determining the mean of the treatment.

b. Only one-half instead of one replication is accorded to the
extra plot of the second treatment in block v.

Then

$E_d = \sqrt{\text{error variance} \times \left(\frac{1}{n - 1} + \frac{1}{n - \frac{1}{2}}\right)}$

for comparing the mean of treatment s and any other treatment
mean. Thus, in the example cited, for the comparison of
treatment C with either A or B,

$E_d = \sqrt{\frac{2.11}{5} + \frac{2.11}{5.5}} = 0.90$

A significant difference between means is one greater than
0.90 X 2.262 = 2.035. On this basis, variety C is better than
either A or B.
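The two standard errors used above can be compared side by side in a short sketch. The function names are hypothetical; the inputs are the error variance of 2.11 on 9 degrees of freedom and n = 6 blocks, with 2.262 as the 5 per cent value of t for 9 degrees of freedom.

```python
import math


def se_difference_complete(error_variance, n):
    """Standard error of the difference between two treatment means
    when neither treatment has a missing plot."""
    return math.sqrt(2 * error_variance / n)


def se_difference_with_missing(error_variance, n):
    """Approximate standard error when one treatment has a missing plot:
    that treatment is credited with n - 1 replications and the other
    with n - 1/2, as in rules (a) and (b)."""
    return math.sqrt(error_variance * (1 / (n - 1) + 1 / (n - 0.5)))


error_variance = 2.11
n_blocks = 6

se_a_vs_b = se_difference_complete(error_variance, n_blocks)
se_c_vs_other = se_difference_with_missing(error_variance, n_blocks)
```

The comparison involving the treatment with the missing plot carries the larger standard error, so a somewhat bigger difference between means is needed before it is judged significant.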
In order to illustrate the necessity of applying the missing-plot
technique, it is proposed to use the same data to carry out a
direct analysis of variance, ignoring the fact that the layout has
become nonorthogonal.

Total S.S. = $12^2 + 13^2 + \cdots - \frac{198^2}{17} = 145.9$ with 16 degrees of freedom

Block S.S. = $\frac{33^2 + 42^2 + \cdots}{3} + \frac{27^2}{2} - \frac{198^2}{17} = 44.1$ with 5 degrees of freedom

Variety S.S. = $\frac{69^2 + 57^2}{6} + \frac{72^2}{5} - \frac{198^2}{17} = 65.7$ with 2 degrees of freedom

Error S.S. = 145.9 - (44.1 + 65.7) = 36.1 with 9 degrees of freedom

The error variance is $\frac{36.1}{9} = 4.01$. On this basis the standard
error of the difference between the means of varieties A and B is
$\sqrt{\frac{4.01 \times 2}{6}} = 1.16$. This test would seem to show that the
difference is nonsignificant, whereas by using the more accurate
missing-plot technique, this difference has already been proved
significant. The entanglement of the block and treatment
differences, as a result of the missing plot in block v, invalidates
a simple analysis of variance of the data.

Estimation of a Missing Plot in a Latin Square. In a Latin
square with one missing plot the same general principles regarding
the best analysis of variance to use apply. In this case,
the formula for estimating the missing yield is

$x = \frac{n(T_r + T_c + T_t) - 2T}{(n - 1)(n - 2)}$

where T_c = the total of the remaining n - 1 plots in the column
            with the missing plot.
      T_r = the total of the remaining n - 1 plots in the row
            with the missing plot.
      T_t = the total of the remaining n - 1 plots of the treatment
            with the missing plot.
      T   = the grand total of all available plots.

Then the standard error of the difference between the mean of
treatment s and any other treatment mean is …

Analysis of Variance When Several Plots Are Missing.—An extension of this method of estimating the yield of a missing plot can be used for data in which several plot yields are missing. This is effected by putting in an arbitrary approximation for all the missing-plot yields except one and applying the formulas already given for estimating the value of this one. This value is then entered and used in the determination of the value of the second missing plot; and so on in sequence for each plot in turn, the total for the table being altered in accordance with each new value obtained.

The whole process should be repeated a second time with the first estimates entered, reestimating again for each plot in turn. This will give yields that are accurate within 0.01 of the required values.

The same general rules regarding the analysis of variance and the calculation of the standard errors, as already discussed for a single blank entry in the data, hold good when there are several missing plots. When there are more than three missing plots, the treatment variance may be significantly exaggerated. In more extreme cases, it is questionable whether elaborate statistical computation can effectively take the place of actual field data.
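The sequential procedure just described can be sketched in Python. This is only an illustrative sketch, not the author's own layout of the arithmetic: the function name `estimate_missing_latin`, the use of `None` to mark a missing plot, and the synthetic, exactly additive 5 × 5 square used to try it out are all assumptions introduced here. The single-plot formula is the one implied by the worked figures of Example 31, x = [n(row + column + treatment totals) − 2 × grand total] / [(n − 1)(n − 2)].

```python
def estimate_missing_latin(yields, treatments, sweeps=2):
    """Fill the missing plots (None) of an n x n Latin square, one at a time.

    yields     : n x n list of plot yields, None where a plot is missing
    treatments : n x n list of treatment labels (each once per row and column)
    sweeps     : number of passes over the missing plots (the text suggests two)
    """
    n = len(yields)
    missing = [(r, c) for r in range(n) for c in range(n) if yields[r][c] is None]
    known = [v for row in yields for v in row if v is not None]
    start = sum(known) / len(known)  # arbitrary first approximation: the mean
    table = [[start if v is None else float(v) for v in row] for row in yields]

    for _ in range(sweeps):
        for r, c in missing:
            t = treatments[r][c]
            # Totals of the other n - 1 plots in the same row, column, treatment.
            row_tot = sum(table[r][j] for j in range(n) if j != c)
            col_tot = sum(table[i][c] for i in range(n) if i != r)
            trt_tot = sum(table[i][j] for i in range(n) for j in range(n)
                          if treatments[i][j] == t and (i, j) != (r, c))
            # Total of every other plot, using the latest estimates entered.
            grand = sum(map(sum, table)) - table[r][c]
            table[r][c] = (n * (row_tot + col_tot + trt_tot) - 2 * grand) / ((n - 1) * (n - 2))
    return table


# Hypothetical check on a synthetic square with purely additive effects.
n = 5
treat = [[(r + c) % n for c in range(n)] for r in range(n)]
row_eff, col_eff, trt_eff = [0, 1, 2, 3, 4], [0, 2, 1, -1, 3], [5, -2, 0, 3, 1]
truth = [[20 + row_eff[r] + col_eff[c] + trt_eff[treat[r][c]] for c in range(n)]
         for r in range(n)]
gaps = [(0, 0), (1, 3), (4, 2)]
data = [[None if (r, c) in gaps else truth[r][c] for c in range(n)] for r in range(n)]
filled = estimate_missing_latin(data, treat, sweeps=30)
```

With purely additive data the repeated sweeps settle on the values that were removed, which is a convenient way to convince oneself that the routine is wired correctly before applying it to field records.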

Example 31.—As an example of the procedure, the analysis for the following 5 × 5 Latin square with three missing-plot yields x, y, and z is given in detail. Entering the mean value (Table 65) as an arbitrary estimate of the yield of plot y and of plot z, substitution in the formula for a missing plot in a Latin square gives

x₁ = [5(77 + 74 + 86) − 2 × 480]/12 = 18.7

Sufficient accuracy will be obtained if the calculations are taken to one place of decimals beyond that used for the plot yields.


Using this value for x and the same arbitrary value of 20 for z,

y₁ = [5(62 + 86 + 84.7) − 2 × 478.7*]/12 = 17.2

Then, substituting this value for y,

z₁ = [5(75.7 + 83.2 + 79) − 2 × 475.9*]/12 = 19.8

Repeating this process for each plot in turn so as to correct the preliminary estimates x₁, y₁, z₁,

x₂ = [5(76.8 + 74 + 83.2) − 2 × 475.7*]/12

Similarly y on recalculation becomes 17.4, and z on recalculation becomes 19.8.

* The revised total for 24 plots when the latest estimate is entered in the table.

Table 65.—Layout of 5 × 5 Latin Square and Yield of Sugar Cane in Hundredweights per 1/40-acre Plot for 5 Varieties A, B, C, D, and E

Total of the 22 available plot yields = 440 cwt.
Mean of the 22 available plot yields = 20 cwt. per plot

Obviously there will be little advantage in quoting the missing-plot yields in smaller units than the actual ones, and in the following analysis of variance, the nearest integer for these estimates has been used, i.e.,

x = 18
y = 17
z = 20
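The first round of substitutions in Example 31 can be retraced in a few lines of Python. This is a sketch under stated assumptions: the helper name `latin_missing_plot` is invented here, and because the text does not say which of the three quoted totals belongs to the row, the column, or the variety, they are simply passed together (the formula treats them symmetrically).

```python
def latin_missing_plot(n, line_totals, grand_total):
    """Missing-plot estimate for an n x n Latin square.

    line_totals : totals of the other n - 1 plots in the same row, column,
                  and treatment as the missing plot (order is immaterial)
    grand_total : total of all other plots, including any provisional
                  estimates already entered for other missing plots
    """
    return (n * sum(line_totals) - 2 * grand_total) / ((n - 1) * (n - 2))


# Figures quoted in the text; the 22 available plots total 440 cwt.
x1 = latin_missing_plot(5, (77, 74, 86), 480)        # y and z both taken as 20
y1 = latin_missing_plot(5, (62, 86, 84.7), 478.7)    # x entered as 18.7, z as 20
z1 = latin_missing_plot(5, (75.7, 83.2, 79), 475.9)  # x as 18.7, y as 17.2

print(f"x1 = {x1:.3f}, y1 = {y1:.3f}, z1 = {z1:.3f}")
```

The three values come out close to the 18.7, 17.2, and 19.8 carried forward in the text; the small differences are only a matter of how many decimals are retained at each step.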


Table 66.—Analysis of Variance of Completed Data

Factor         S.S.    Degrees of freedom    Variance
Total          220           21†
Rows             2            4
Columns         17            4
Varieties      181            4               45.2**
Error           20            9                2.2

** Significant at the 1 per cent point.
† A deduction of 3 degrees of freedom for the three missing plots.


The treatment variance is clearly significant. The variety means for the completed table are as follows:

        Hundredweights per plot
A ............ 15.8
B ............ 22.6
C ............ 18.6
D ............ 19.0
E ............ 23.0

Varieties B and E have apparently given high yields and A a low yield.

It would be interesting to test further whether D is significantly better than A and E than D. The appropriate standard errors for these comparisons are required. Here again, the number of replications accorded to each mean is based primarily on the total number of actual yields obtained for that treatment in the field records. Furthermore, any treatment mean is accorded only two-thirds of a replication for each field plot located in a row or a column where the opposite treatment shows a missing plot; and only one-third of a replication if the second treatment is missing in both row and column. On this basis, in a comparison of varieties A and D, the number of replications of A is equivalent to 3⅓ and of D to 3, so that

Standard error of difference between means of A and D = √[2.2 × (1/3⅓ + 1/3)] = 1.18

Significant difference is one greater than 1.18 × 2.262 = 2.67
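The standard error above can be reproduced with a short Python sketch. The function name `se_difference` is an assumption made here for illustration; the error variance (2.2), the effective replications (3⅓ and 3), and the tabulated t (2.262 for 9 degrees of freedom) are the figures quoted in the text.

```python
import math


def se_difference(error_variance, reps_a, reps_b):
    """Standard error of the difference between two treatment means that
    carry (possibly fractional) effective numbers of replications."""
    return math.sqrt(error_variance * (1.0 / reps_a + 1.0 / reps_b))


error_variance = 2.2   # error variance from Table 66
t_5pct_9df = 2.262     # tabulated t used in the text
se_ad = se_difference(error_variance, 10 / 3, 3)   # A: 3 1/3 replications, D: 3
significant_diff = se_ad * t_5pct_9df

print(f"S.E. = {se_ad:.2f}; significant difference = {significant_diff:.2f}")
```

The observed difference between D and A (19.0 − 15.8 = 3.2 cwt.) exceeds this limit.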



The difference between means is 3.2 cwt. and therefore significant. Similarly, for the comparison of varieties D and E,

Standard error of difference between means = 1.15

The difference between the means is 4, and therefore also significant. The varieties can now be accurately graded:

B and E     High yielders
C and D     Average yielders
A           Low yielder

SERIAL EXPERIMENTS

Before it is advisable to recommend the modification of existing field practice, any important experiment should be repeated for at least 3 years in order to be sure that the results hold good under varying seasonal conditions. The final recommendations should be based on the accumulated data of all the experiments. There are various ways of effecting a comprehensive interpretation of such serial experiments. As the yield data for each year will normally be analyzed as soon as the records reach the laboratory, a simple method is to make a résumé of the individual results and use this accumulated evidence to assess the real differences between treatments. For example, if one variety was the best in three successive years, this could be regarded as conclusive proof of its superiority. If, on the other hand, this variety was the best in the first 2 years but below average in the third, any statement of its performance would require some seasonal qualification.

This method of assessing the results is perfectly valid, but it does not make the fullest use of the available data. It is possible to construct a single analysis of variance for the combined records for the seasons, and ascertain from this the exact significance of the various component factors. In such an analysis the seasonal component will generally account for a good share of the total dispersion of the variates.

Example 32. Statistical Analysis of Combined Data from Two Sugar Cane Variety Trials for Years 1933 and 1934, Respectively.—

Table 67.—Yield of Cane in Quarters per 1/40-acre Plot

                      Exp. I (1933)                    Exp. II (1934)            Variety
               Blocks                  Variety   Blocks                  Variety  total for
Variety         1    2    3    4    5  total      1    2    3    4    5  total    Exp. I and II
St. Croix      41   44   45   45   44    219     38   35   41   39   45    198       417
B.H. 10(12)    52   61   58   66   49    286     50   51   48   64   63    276       562
Uba            44   54   51   52   60    261     46   37   47   49   48    227       488
B. 726         56   50   60   56   60    282     47   57   55   65   46    270       552
Co. 213        51   56   61   64   63    295     43   51   56   56   72    278       573
Block total   244  265  275  283  276  1,343    224  231  247  273  274  1,249     2,592

The variance for season includes that due to the difference in soil fertility between the two experiments as a whole, so that the sum of squares and degrees of freedom for blocks are the aggregate values for the two experiments taken separately. It will also be necessary to estimate the interaction between variety and season to ascertain whether or not the varieties have shown any differential response to the change in climatic conditions between 1933 and 1934.


Table 68.—Analysis of Variance

Factor                             S.S.     Degrees of freedom    Variance
Total                            3,526.7           49
Varieties                        1,721.7            4              430.4**
Seasons                            176.7            1              176.7*
Interaction: season × variety       36.3            4                9.1
Blocks                             614.4            8               76.8
Error                              977.6           32               30.5

* Significant at 5 per cent point.
** Significant at 1 per cent point.

The only significant components in this analysis are variety and season. The interaction is nonsignificant, and it can be assumed that the varieties have maintained the same relative position for the 2 years. The conversion factor required to express the variety totals in tons per acre is one-twentieth. The mean yields then become:

                    Tons per acre
Co. 213 .......... 28.65 ± 0.873
B.H. 10(12) ...... 28.10 ± 0.873
B. 726 ........... 27.60 ± 0.873
Uba .............. 24.40 ± 0.873
St. Croix ........ 20.85 ± 0.873

proving that St. Croix is the poorest with Uba next, while the other three varieties are all equally good and definitely superior to these two.

The variance for season is also significant. As the test has given a positive result and there are only two seasons, there is no need to calculate t. It can be stated without further preamble that the 1933 season has been better for the cane crop than that of 1934. The use of the variety × season variance as an estimate of error, so as to widen the scope of the conclusions, has already been discussed in the previous chapter (page 175).
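The conversion to tons per acre and the standard error of ± 0.873 quoted above can be retraced as follows. This is a sketch resting on assumptions made here: each variety total covers 10 plots of 1/40 acre, a ton is taken as 80 quarters (so that the factor is 40/(10 × 80) = 1/20), and the variable names are purely illustrative.

```python
import math

# Variety totals in quarters for the 10 plots of each variety (Table 67).
variety_totals = {"Co. 213": 573, "B.H. 10(12)": 562, "B. 726": 552,
                  "Uba": 488, "St. Croix": 417}
error_variance = 30.5    # error variance per plot, Table 68
plots_per_total = 10
factor = 1 / 20          # variety total (quarters) -> tons per acre

# S.E. of a total of k plots is sqrt(k x error variance); the same factor
# that converts the total also converts its standard error.
se_tons = math.sqrt(plots_per_total * error_variance) * factor
tons = {name: total * factor for name, total in variety_totals.items()}

for name, value in tons.items():
    print(f"{name:12s}{value:6.2f} +/- {se_tons:.3f}")
```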

The analysis of variance of the combined data for experiments repeated for 2 years or more is in line with that given in Example 29 for three Latin squares. That example records the results from 1 year's experiment consisting of three Latin squares, but the statistical analysis would be exactly the same if the data represented 3 years' experiment with the same treatments and one 3 × 3 Latin square in each season. In the latter eventuality, seasonal effects would be entangled with the fertility differences between the soils in the three squares, and the variance for season would really measure the combined effects of these two factors. It would not be possible in the analysis of variance to segregate the seasonal effects and those due to fertility differences between the squares. If, however, the 3 years' experiment represented, in each season, merely a new randomization of the treatments on the same site and on the same plots, it could be safely assumed that the seasonal factor was the one chiefly responsible for the variance attributed to season in the analysis. However, such a layout would tend to make the conclusions rather less general in application than those obtained from a 3 years' trial using three different sites.


EXPERIMENTS WITH PERENNIAL CROPS

Perennial and semiperennial plants such as orchard and plantation crops, sugar cane, bananas, tropical fodder grasses, etc., present to the field experimentalist additional problems not encountered in dealing with the ordinary annual arable crops. The extreme type of these perennials is found in the fruit orchard where the yield data come from a limited number of relatively large trees. The trees themselves are generally far from uniform in their genetical composition and, consequently, also in their potential yield capacity. In any old orchard there will almost certainly be several age classes. Even where the trees are all of the same age, it will be found that some bear early and reach their maximum quickly, while others may be slow in maturing but continue to yield heavily for a much longer period. The trees are widely spaced, and therefore only a relatively small number can be included in a single plot, as apart from the question of acreage available, if the plots are made too large, the major soil fertility differences within the blocks will counteract the advantage gained by increasing the number of trees per plot. Even where the number of trees is reduced to a minimum, the plot size at an average spacing of 25 feet will be in the neighborhood of ..., and the effects of soil differences within plots will be considerable. The root spread per plant is extensive and makes the inclusion of nonexperimental border trees essential to avoid edge interference. The skill of the field staff as regards pruning, harvesting, spraying, etc., has a decided influence on the total yield given by any one tree. The crop is a perennial, and the differential response of the individual plants to the varying weather conditions from year to year introduces a further uncontrollable variation factor. The yield data alone do not necessarily measure the whole effect of any particular treatment. The quality of the produce is often fully as important as the quantity. The rate of growth, root spread, susceptibility to disease, etc., may be greatly improved without any immediate effect being reflected in the yield data.

In designing an experiment for a perennial crop, the following factors all require careful consideration:

Plant Uniformity.—Wherever possible, seedling trees should be avoided for experimental purposes. The vegetative method of propagation is much more likely to produce similar types of tree, and the planting material should come from the same parent stock. Where grafting or budding processes are involved, every effort should be made to standardize both the rootstock and the scion. Where the experiment is to be superimposed on established plantations, a locality in which the trees are all of one age class should be selected.


Size of Plot.—On account of the wide spacing required by most orchard crops, plots of larger size than those recommended for annual crops will usually be necessary. Each plot should contain 10 or more trees; with an average orchard spacing, this will give a plot size of approximately ...

Layout.—Probably the Latin square is to be preferred, as any one experiment will extend over a considerable area of land, and this arrangement most effectively reduces the error due to soil fertility differences.

Cultural Treatment.—The trees should be given parallel treatment as regards pruning, harvesting, soil cultivation, etc. Even the difference in skill and care between two expert pruners or pickers may be sufficiently large to produce a significant effect on the ultimate yields.

Border Rows.—Nonexperimental guard rows between plots will generally be required to obviate edge interference and prevent any one treatment from affecting the growth of the trees on adjacent plots. The most efficient system, but one which utilizes a large area, is to establish a separate guard row around every plot, so that between the effective plot units there will always be two guard rows. In this system, each tree in any one border row is given the same treatment as the plot to which it belongs, but the yields of the border trees are not entered in the experimental records. With wide-spaced orchard crops, the area of land occupied by the guard rows will often be greater than that of the actual experimental plots. In this connection it should be noted that the greater the number of trees in the experimental plot the smaller will be the proportion of land under border rows. To quote an extreme example, to isolate a plot made up of a single tree will require 8 border trees on a rectangular layout, or four border trees on a triangular or quincuncial layout. Whereas, on a square plot of 16 trees, only 20 guard trees will be necessary, reducing the proportion of border to experimental plants ... The shape of the plot is important ... with 20 for the same number of trees on a square layout. A single row of 16 trees would need no less than 38 border units. These features are illustrated in diagrammatic form below.


[Diagrams: single tree plot on a rectangular layout and on a quincuncial layout; rectangular plot.]
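The guard-tree counts quoted above (8, 20, and 38) all follow from ringing a rectangular group of trees with one extra row on a square-grid planting. A minimal Python sketch, assuming every plot is completely surrounded by a single guard row; the function name `guard_trees` is chosen here for illustration, and the 2 × 8 shape is a hypothetical extra case not taken from the text:

```python
def guard_trees(rows, cols):
    """Border trees needed to ring a rows x cols plot on a rectangular grid."""
    return (rows + 2) * (cols + 2) - rows * cols


for shape in [(1, 1), (4, 4), (2, 8), (1, 16)]:
    n_plot = shape[0] * shape[1]
    n_guard = guard_trees(*shape)
    print(f"{shape[0]} x {shape[1]:2d} plot: {n_plot:2d} trees, "
          f"{n_guard:2d} guard trees, ratio {n_guard / n_plot:.2f}")
```

The ratio of guard to experimental trees falls as the plot grows and rises as the plot becomes long and narrow, which is the point made in the paragraph above.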


In certain circumstances, where the land or the number of trees of a given age class is limited, it may be necessary to compromise in order to obtain the requisite number of replications. One method of doing this is to allow for only a single or common guard row between plots, choosing a type of treatment intermediate in character between those being tested. For example, in a variety test an entirely different species might be selected for the guard rows; this has the added advantage of effectively outlining the experimental plots. When the use of a single guard row is considered practicable, the nonexperimental area will be reduced by almost 50 per cent. Another compromise is to plant guard rows along only the two sides of each plot, so that every tree is surrounded on at least three sides by similar units.

Records.—Most authorities recommend the separate recording of the yields of every single tree in the experiment. Many purely commercial growers keep such records in order to allow them to eradicate nonprofitable yielders. In experimental work, individual tree records make it possible to ascertain the variation within plots and may even demonstrate the advisability of eliminating particular blocks showing undesirable heterogeneity. Yield data for each tree often prove invaluable in planning further experiments. It is sometimes possible to use such previous records to work out the analysis of covariance in order to effect a valid reduction in the estimate of the error variance by means of the regression function (Chap. IX).

Most trees have a definite fruiting period during the year, and, in consequence, the yield data can be grouped according to year as well as according to treatment and block. In orchard crops where there is a tendency to yield heavily every second year, statistical analysis applied to the combined yield of plots for two consecutive harvests has obvious advantages.

The yield data alone represent only a small part of the total information that should be collected. In many experiments other factors are equally important in estimating the effects of treatment, e.g., girth increment, spread of the tree, annual wood growth, fruit-bud productivity, and the percentage of shedding, etc. Observations regarding the general tone and vigor of the trees, the prevalence of fungal and insect attack, and other similar cultural details are also of value. All these points serve to emphasize the fact that experimentation with a perennial crop must be tackled in earnest from the outset. It is a lengthy, expensive, and often disappointing task. Before conclusive results can be expected, the experiment must be carefully designed and efficiently supervised from start to finish. Whatever system is followed, the cost of experimentation will be considerable, and it is essential to justify this outlay by scrupulous attention to detail, which is the only way of approaching the ultimate ...

Classification of Produce.—The bulk yield per plot is of only limited utility. The produce after harvest usually has to be classified into first, second, third, and scrap grades; the statistical analysis may be limited to the total yield per plot of marketable produce, but an estimate of the percentage of each quality in the salable fruit is also essential if correct conclusions are to be formed. With many crops chemical analysis may be deemed necessary in order to assess the relative quality of the produce.


STATISTICAL ANALYSIS OF DATA FROM PERENNIAL CROPS

Most experiments with perennial crops will have to continue for a period of several years in order to permit of an accurate comparison of the various treatments. An analysis of each season's data separately is perfectly valid, and the accumulated evidence thus obtained may be satisfactorily conclusive. As a general rule, however, the complete data for the whole of the experimental period should be coordinated in a single table, and to this extent the records resemble those for serial experiments. With perennial crops there is an important difference in that the randomization is not changed, and the yield data for each successive year are for the same plots and the same plants. The statistical analysis has to be modified accordingly, because a series of successive harvests or pickings from any plot does not entail any increase in the number of replications on which the measure of plot variation will be based. This principle must not be forgotten in the application of statistics to the results. On the other hand, the total yield from any plot is divisible into so many separate subunits according to season, and it may be important to assess the relative response of treatment to season. The analysis of variance should therefore be of the error (a) and error (b) type, in which the former is used to assess the aggregate difference between treatments for the whole period covered by the experiment and the latter to measure any differential response of the treatments to varying seasonal conditions.

Example 33. Statistical Analysis of Two Years' Yield Data from a Perennial Crop.—To illustrate the difference in the statistical technique, it will be assumed that the data from Table 67 for a serial experiment represent the yields from the same plots in 1933 and 1934. The analysis of variance appended

Table 69.—Yield of Plant Cane and First Ratoon Crop in Quarters per 1/40-acre Plot

                                     Blocks
Variety         Harvest      1     2     3     4     5    Total across
St. Croix       1933        41    44    45    45    44        219
                1934        38    35    41    39    45        198
                Total       79    79    86    84    89        417
B.H. 10(12)     1933        52    61    58    66    49        286
                1934        50    51    48    64    63        276
                Total      102   112   106   130   112        562
Uba             1933        44    54    51    52    60        261
                1934        46    37    47    49    48        227
                Total       90    91    98   101   108        488
B. 726          1933        56    50    60    56    60        282
                1934        47    57    55    65    46        270
                Total      103   107   115   121   106        552
Co. 213         1933        51    56    61    64    63        295
                1934        43    51    56    56    72        278
                Total       94   107   117   120   135        573
Block total                468   496   522   556   550      2,592 grand total
Block × season  1933       244   265   275   283   276      1,343
total           1934       224   231   247   273   274      1,249


A simple randomized block analysis of the aggregate yields of the 25 plots for the two harvests will determine the best varieties on the basis of the two seasons' crop. The yield of the ratoon crop, relative to that of the plant cane, is of considerable significance, as the number of years the crop should be left in the field before replanting becomes necessary depends on this factor. This makes it advisable to carry out a more detailed analysis of the complete data in order to obtain an accurate statistical evaluation of the relative behavior of the varieties for each harvest. It will be assumed that the student is capable of using Table 69 to calculate any particular component of the analysis appended, the units in which the results are expressed being the plot yields of one season. This means that the sums of squares and variances of all the factors in the analysis, including error (a), are expressed in subunits.


Table 70.—Analysis of Variance

Factor                              S.S.     Degrees of freedom    Variance
Total for individual harvests     3,526.7           49
1933–1934 aggregate yields        2,676.7           24
  Blocks                            546.7            4              136.7**
  Varieties                       1,721.7            4              430.4**
  Error (a)                         408.3           16               25.5
Season                              176.7            1              176.7*
Interactions:
  Season × variety                   36.3            4                9.1
  Season × blocks                    67.7            4               16.9
Error (b)                           569.3           16               35.6

* Significant at 5 per cent point.
** Significant at 1 per cent point.


If this analysis is compared with that given in Table 68, it will be seen that the difference is only one of allocation of the total dispersion between its various components. The aggregate sum of squares for error (a) and error (b) is identical with that given originally under the error factor. The sum of squares for blocks added to that for the interaction of season × blocks is the same as the block sum of squares in Table 68. The other factors are unchanged.

In assessing results on this new allocation, error (a) should be used for the comparison of values derived from the aggregate plot yields for the two seasons, which in this example will be the block and the variety totals. In calculating the standard error of these totals, it is important to remember that the analysis is in subunit values and that each total represents the aggregate of so many subunits. Error (b) is the correct estimate of variance for the other factors in which the comparable means depend on the way the total plot yields are apportioned between the two harvests. The hypothesis that the data are from two successive harvests of a perennial crop introduces only a minor change in the final conclusions. On the new basis, the block sum of squares is significant. The relative values of the five varieties remain unaltered.
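The reallocation between Table 68 and Table 70 can be checked numerically. The Python sketch below is an arithmetic illustration only; the variable names are assumptions made here, and the yields are entered as read from Table 69. The aggregate (two-harvest) plot totals are divided by 2 so that everything stays in single-harvest units, as the text requires.

```python
# Plot yields from Table 69: variety -> (1933 harvest, 1934 harvest), blocks 1-5
yields = {
    "St. Croix":   ([41, 44, 45, 45, 44], [38, 35, 41, 39, 45]),
    "B.H. 10(12)": ([52, 61, 58, 66, 49], [50, 51, 48, 64, 63]),
    "Uba":         ([44, 54, 51, 52, 60], [46, 37, 47, 49, 48]),
    "B. 726":      ([56, 50, 60, 56, 60], [47, 57, 55, 65, 46]),
    "Co. 213":     ([51, 56, 61, 64, 63], [43, 51, 56, 56, 72]),
}
plots = [y for pair in yields.values() for harvest in pair for y in harvest]
cf = sum(plots) ** 2 / len(plots)
ss_total = sum(y * y for y in plots) - cf

# Aggregate yields: one total per plot over both harvests.
plot_totals = {v: [a + b for a, b in zip(*pair)] for v, pair in yields.items()}
ss_aggregate = sum(t * t for row in plot_totals.values() for t in row) / 2 - cf
block_totals = [sum(row[b] for row in plot_totals.values()) for b in range(5)]
ss_blocks = sum(t * t for t in block_totals) / 10 - cf
ss_variety = sum(sum(row) ** 2 for row in plot_totals.values()) / 10 - cf
error_a = ss_aggregate - ss_blocks - ss_variety

# Season and its interactions, tested against error (b).
season_totals = [sum(sum(pair[s]) for pair in yields.values()) for s in (0, 1)]
ss_season = sum(t * t for t in season_totals) / 25 - cf
sv_cells = [sum(pair[s]) for pair in yields.values() for s in (0, 1)]
ss_sv = sum(t * t for t in sv_cells) / 5 - cf - ss_season - ss_variety
sb_cells = [sum(pair[s][b] for pair in yields.values())
            for s in (0, 1) for b in range(5)]
ss_sb = sum(t * t for t in sb_cells) / 5 - cf - ss_season - ss_blocks
error_b = ss_total - ss_aggregate - ss_season - ss_sv - ss_sb

print(f"Blocks {ss_blocks:.1f}, error (a) {error_a:.1f}, "
      f"season x blocks {ss_sb:.1f}, error (b) {error_b:.1f}")
```

Adding error (a) to error (b), and blocks to season × blocks, recovers the error and block sums of squares of Table 68.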

Example 34. Statistical Analysis of Three Years' Data from a Manurial Experiment on Tea.—This experiment was designed to test the effect, on the yield of tea, of sulphate of ammonia and muriate of potash, alone and in combination. The former was applied at the rate of 0, 1, or 2 hundredweights, and the latter at 0 and 1 hundredweight, per acre per annum. There were therefore six treatment types in all as shown at the head of Table 71. The layout was on the randomized block principle with a total of five blocks. The plot size was 1/40 acre. The results are summarized in Table 71, the yields quoted representing the total of five plots or replicates.

Table 71.—Treatment Yields of Tea in Pounds of Dry Matter per Five Plots

                         Treatments                       Seasonal    Season × nitrogen totals    Season × potash totals
Season    N0K0   N0K1   N1K0   N1K1   N2K0   N2K1          total        N0      N1      N2           K0      K1
1933       254    251    258    257    259    263          1,542        505     515     522          771     771
1934       466    439    471    485    476    512          2,849        905     956     988        1,413   1,436
1935       286    291    326    355    396    381          2,035        577     681     777        1,008   1,027
Total    1,006    981  1,055  1,097  1,131  1,156   Grand  6,426      1,987   2,152   2,287        3,192   3,234


As the individual plot yields are not given, it is not possible to calculate the block and block × season interaction from the data in the table. The sums of squares for these factors have been assessed from the original results and will have to be taken on trust. The analysis of variance is otherwise quite straightforward. For example,

S.S. potash = (3,192² + 3,234²)/45 − 6,426²/90 = 19.6

S.S. seasons = (1,542² + 2,849² + 2,035²)/30 − 6,426²/90 = 29,043.3

Interaction: season × potash = (771² + 1,413² + 1,008² + 771² + 1,436² + 1,027²)/15 − 6,426²/90 − (19.6 + 29,043.3) = 10.0


Table 72.—Analysis of Variance

Factor                             S.S.      Degrees of freedom     Variance
Total                           36,460.3            89
Aggregate yields for 3 years     4,209.5            29
  Blocks                         1,630.0             4               407.5**
  Nitrogen                       1,505.0             2               752.5**
  Potash                            19.6             1                19.6
  Interaction N × K                 80.9             2                40.5
  Error (a)                        974.0            20                48.7
Seasons                         29,043.3             2            14,521.7**
Season × blocks                    909.1             8               113.6**
Season × N                         861.1             4               215.3**
Season × K                          10.0             2                 5.0
Season × N × K                     223.3             4                55.8
Error (b)                        1,204.0            40                30.1

** Significant at the 1 per cent point.


The analysis shows that the seasonal factor is the one responsible for the major portion of the dispersion shown by the variates. The 1933 season has evidently been a very poor year and the 1934 a very good one for the tea crop. The arrangement in randomized blocks has been of considerable benefit in reducing the effects of soil heterogeneity. Over the whole of the 3 years, there is a marked response to increasing doses of nitrogen. This response has evidently not been the same in each season: there has been no response to the sulphate in the first year, a fair response in the second year, and a very marked response in the third. This suggests a gradual improvement in the general vigor and growth of the plant and not merely a temporary response to the additional nutrient. The potash has been entirely ineffectual.
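The seasonal pattern of the nitrogen response can be read directly from the season × nitrogen totals of Table 71. The Python sketch below compares the response (N2 total minus N0 total) in each year with a least significant difference. Several things here are assumptions rather than the author's stated working: that error (b) is the appropriate variance for comparing two totals within a season, that each total covers 10 plots, and that 2.021 is the tabulated 5 per cent value of t for 40 degrees of freedom.

```python
import math

# Season x nitrogen totals from Table 71 (N0, N1, N2), 10 plots each.
season_by_n = {1933: (505, 515, 522), 1934: (905, 956, 988), 1935: (577, 681, 777)}
error_b = 30.1          # error (b) variance from Table 72
plots_per_total = 10
t_5pct_40df = 2.021     # assumed tabulated value of t, 40 degrees of freedom

# S.E. of the difference between two totals of 10 plots each.
se_diff = math.sqrt(2 * plots_per_total * error_b)
lsd = t_5pct_40df * se_diff

response = {year: n2 - n0 for year, (n0, n1, n2) in season_by_n.items()}
for year, gain in response.items():
    verdict = "significant" if gain > lsd else "not significant"
    print(f"{year}: N2 - N0 = {gain:4d} lb. ({verdict}, limit {lsd:.1f})")
```

On these assumptions the response is negligible in 1933, clear in 1934, and much larger in 1935, in line with the interpretation given in the text.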

ANALYSIS OF GROUPED DATA WHEN DIFFERENT ASSUMED MEANS ARE USED FOR EACH GROUP

The advantage of using an appropriate assumed mean in order to reduce the amount of arithmetical calculation in the computation of an analysis of variance has already been noted. With accumulated data from perennial crops or from serial experiments over several years, the variation in the average yield from season to season is often so great that it will annul much of the advantage of using the ordinary assumed-mean technique. For example, for the yield data of Table 71 an assumed-mean value of 360 pounds might be selected as a suitable approximation to the true mean, but the variation in yield from 1933 to 1935 is such that the deviation of many of the recorded yields from this assumed mean will run to three digits and the squares of these deviations to five digits. It is obvious that such large deviations would be avoided if a different, but appropriate, assumed mean was used for the data of each year. This may actually be done and, provided a suitable modification in the statistical procedure is introduced, the detailed analysis of variance, as tabulated in Table 72, may be derived with considerable reduction in the amount of routine arithmetic. As an illustration of the statistical technique, the data from Table 71 have been rearranged in Table 73 with the yields recorded around assumed means of

260 lb. for the year 1933,
470 lb. for the year 1934,
320 lb. for the year 1935.

Table 73.—Treatment Yields of Tea in Pounds of Dry Matter per Five Plots, around Assumed-mean Values Appropriate to Each Season

          Assumed                  Treatments                      Seasonal    Season × nitrogen total    Season × potash total
Season     mean     N0K0   N0K1   N1K0   N1K1   N2K0   N2K1         total        N0      N1      N2          K0      K1
1933        260       −6     −9     −2     −3     −1     +3           −18        −15      −5      +2          −9      −9
1934        470       −4    −31     +1    +15     +6    +42           +29        −35     +16     +48          +3     +26
1935        320      −34    −29     +6    +35    +76    +61          +115        −63     +41    +137         +48     +67
Total                −44    −69     +5    +47    +81   +106   Grand  +126       −113     +52    +187         +42     +84

When the evaluation of any factor in the analysis of variance 
is derived from treatment totals representing aggregate values 
for all three seasons, the arithmetical procedure is exactly the 
same as that adopted when only one assumed mean has been 
used. Remembering that the yields recorded represent the 
totals for five plots or replicates and that there are therefore 90 
variates in all, the 


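$$\text{General C.F.} = \frac{(\text{Grand total})^2}{90} = \frac{126^2}{90} = 176.4$$

$$\text{S.S. nitrogen} = \frac{113^2 + 52^2 + 187^2}{30} - \text{C.F.} = 1{,}505.0$$

$$\text{S.S. potash} = \frac{42^2 + 84^2}{45} - \text{C.F.} = 19.6$$

$$\text{Interaction: N} \times \text{K} = \frac{44^2 + 69^2 + 5^2 + 47^2 + 81^2 + 106^2}{15} - \text{C.F.} - (1{,}505.0 + 19.6) = 80.9$$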
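The arithmetic above can be checked with a short Python sketch. The six N × K totals are entered as deviations from the seasonal assumed means; their signs are an assumption, inferred from the requirement that they add up to the grand total of +126, and the labels N0 to N2 and K0, K1 are illustrative only.

```python
# N x K totals from Table 73, recorded around the seasonal assumed means.
# Signs are inferred (assumption): the six totals must sum to +126.
nk_totals = {
    ("N0", "K0"): -44, ("N0", "K1"): -69,
    ("N1", "K0"): 5,   ("N1", "K1"): 47,
    ("N2", "K0"): 81,  ("N2", "K1"): 106,
}
VARIATES_PER_CELL = 15  # 5 plots x 3 seasons behind each N x K total
n_variates = VARIATES_PER_CELL * len(nk_totals)  # 90

grand_total = sum(nk_totals.values())
cf = grand_total ** 2 / n_variates  # general correction factor


def margin(position):
    """Add the N x K totals over one factor (0 = nitrogen, 1 = potash)."""
    out = {}
    for key, value in nk_totals.items():
        out[key[position]] = out.get(key[position], 0) + value
    return out


def sum_of_squares(totals, variates_per_total):
    """Sum of squares of a set of totals, less the correction factor."""
    return sum(t ** 2 for t in totals) / variates_per_total - cf


ss_n = sum_of_squares(margin(0).values(), 30)
ss_k = sum_of_squares(margin(1).values(), 45)
ss_nk = sum_of_squares(nk_totals.values(), 15) - ss_n - ss_k

print(round(cf, 1), round(ss_n, 1), round(ss_k, 1), round(ss_nk, 1))
```

Because every treatment total contains the same mixture of the three assumed means, the shift cancels in these comparisons, which is why the ordinary single-assumed-mean procedure still applies.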


These sums of squares tally exactly with those already evaluated
for the same factors in the original analysis of variance
given in Table 72. It is not possible to use the same technique
to evaluate factors in the analysis of variance dependent on the
distribution of the treatment totals between the three seasons.
For example, the season sum of squares is not

$$\frac{18^2 + 29^2 + 115^2}{30} - \text{C.F.}$$

as the three totals used represent values around different assumed
means. It is best to evaluate this sum of squares from the season
totals given in Table 71 or, alternatively, to compute the actual
treatment season totals from Table 73:
1933 season total = (6 × 260) - 18 = 1,542
1934 season total = (6 × 470) + 29 = 2,849
1935 season total = (6 × 320) + 115 = 2,035
Grand total = 6,426

$$\text{S.S. season} = \frac{1{,}542^2 + 2{,}849^2 + 2{,}035^2}{30} - \frac{6{,}426^2}{90} = 29{,}043.3$$
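As a minimal sketch of this step, the actual season totals can be rebuilt from the assumed means and the season totals of the deviations, and the season sum of squares computed from them. The deviation totals (-18, +29, +115) are taken from the within-season calculations of this section; their signs are an assumption checked against the grand total of +126.

```python
assumed_mean = {1933: 260, 1934: 470, 1935: 320}      # lb per five-plot yield
deviation_total = {1933: -18, 1934: 29, 1935: 115}    # season totals of deviations
TREATMENTS = 6   # 3 nitrogen levels x 2 potash levels
PLOTS = 5        # plots behind each recorded yield

# Restore each season total to its true scale before squaring.
season_total = {
    year: TREATMENTS * assumed_mean[year] + deviation_total[year]
    for year in assumed_mean
}
grand_total = sum(season_total.values())

variates_per_season = TREATMENTS * PLOTS               # 30
n_variates = variates_per_season * len(season_total)   # 90

ss_season = (
    sum(t ** 2 for t in season_total.values()) / variates_per_season
    - grand_total ** 2 / n_variates
)

# The invalid shortcut: squaring deviations taken from different assumed means.
ss_wrong = (
    sum(t ** 2 for t in deviation_total.values()) / variates_per_season
    - sum(deviation_total.values()) ** 2 / n_variates
)

print(season_total, grand_total, round(ss_season, 1), round(ss_wrong, 1))
```

The shortcut gives a figure two orders of magnitude too small, since most of the real difference between seasons has been absorbed into the three assumed means.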


Interactions. The treatment × season interactions may be
computed directly from the data of Table 73. For example,

Interaction: season × K
    = total season × potash effect - (S.S. season + S.S. potash)

Total season × potash S.S. = season S.S. + within-season
    potash S.S. (aggregate of three seasons)

Therefore,

Interaction: season × K
    = aggregate within-season potash S.S. - S.S. potash

The potash sum of squares has already been computed, and the
first term can be evaluated from Table 73, as follows:


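Within-season Potash S.S.

$$\text{S.S. K, 1933 season} = \frac{9^2 + 9^2}{15} - \frac{18^2}{30} = 0.0$$

$$\text{S.S. K, 1934 season} = \frac{3^2 + 26^2}{15} - \frac{29^2}{30} = 17.6$$

$$\text{S.S. K, 1935 season} = \frac{48^2 + 67^2}{15} - \frac{115^2}{30} = 12.0$$

$$\text{Aggregate within-season potash S.S.} = 29.6$$

$$\text{Interaction: season} \times \text{potash} = 29.6 - 19.6 = 10.0 \text{ (as originally calculated)}$$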

In the same way the season × nitrogen interaction may be
calculated.

Within-season Nitrogen S.S.

$$\text{S.S. N, 1933 season} = \frac{15^2 + 5^2 + 2^2}{10} - \frac{18^2}{30} = 14.6$$

$$\text{S.S. N, 1934 season} = \frac{35^2 + 16^2 + 48^2}{10} - \frac{29^2}{30} = 350.4$$

$$\text{S.S. N, 1935 season} = \frac{63^2 + 41^2 + 137^2}{10} - \frac{115^2}{30} = 2{,}001.1$$

$$\text{Aggregate within-season nitrogen S.S.} = 2{,}366.1$$

$$\text{Interaction: season} \times \text{N} = 2{,}366.1 - 1{,}505.0 = 861.1$$

Similarly, the second-order interaction, N × K × season,
may be calculated by subtracting the total of all the treatment
components of the analysis of variance from the aggregate
within-season treatment sum of squares which, on calculation, will be
found to be 2,699.9.

$$\text{Interaction: N} \times \text{K} \times \text{season} = 2{,}699.9 - (1{,}505.0 + 19.6 + 80.9 + 861.1 + 10.0) = 223.3$$
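The within-season procedure lends itself to a small helper. The sketch below recomputes the aggregate within-season nitrogen sum of squares and the two interactions that depend on it. The signs of the season × nitrogen totals are an assumption (chosen so that each season's three totals add to its season total), and the figures 19.6, 80.9, 10.0 and 2,699.9 are simply carried over from the text.

```python
# Deviation totals around each season's own assumed mean.
season_total = {1933: -18, 1934: 29, 1935: 115}
nitrogen_by_season = {
    1933: (-15, -5, 2),
    1934: (-35, 16, 48),
    1935: (-63, 41, 137),
}


def within_season_ss(level_totals, season_sum, variates_per_level,
                     variates_per_season=30):
    """S.S. between treatment levels inside one season.

    Valid with a season-specific assumed mean because the correction
    term uses that season's own total.
    """
    between = sum(t ** 2 for t in level_totals) / variates_per_level
    return between - season_sum ** 2 / variates_per_season


aggregate_n = sum(
    within_season_ss(nitrogen_by_season[y], season_total[y], 10)
    for y in season_total
)

SS_N, SS_K, SS_NK, SS_SEASON_K = 1505.0, 19.6, 80.9, 10.0  # from the text
season_x_n = aggregate_n - SS_N

AGGREGATE_TREATMENT = 2699.9  # within-season treatment S.S. quoted in the text
second_order = AGGREGATE_TREATMENT - (
    SS_N + SS_K + SS_NK + season_x_n + SS_SEASON_K
)

print(round(aggregate_n, 1), round(season_x_n, 1), round(second_order, 1))
```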


This use of several assumed mean values in a single analysis of
variance is not limited to data from perennial crops but may be
adopted with advantage in many complex experiments in which
the data can be allocated to certain distinct groups showing a
wide range of values. It is particularly useful in the final
calculation of serial experiments, as it makes it possible to use the
assumed-mean method in the analysis of each year's data
independently and then to combine all the accumulated data
into a single comprehensive analysis of variance with the
minimum amount of recalculation.


RECENT DEVELOPMENTS IN FIELD
EXPERIMENTATION

COMPLEX EXPERIMENTS

The tendency in field experiments today is toward somewhat
complicated designs in which several problems are investigated
in a single large-scale experiment. The advantages to be derived
from the analysis of relatively complex data covering an extensive
field of research have already been discussed at the end of
Chap. TL In agricultural research, the number of different
problems requiring attention is practically unlimited, and the
utility of complex field experiments will largely depend on the
careful selection of suitable combinations of treatment series.
To facilitate the statistical calculations and ensure a valid
interpretation of the data, it is often of advantage to adopt a
factorial design in which several treatment series occur in all
possible combinations; this should preclude an unbalanced or
nonorthogonal layout. For example, in an experiment in which
two varieties A and B and three fertilizers X, Y, and Z are to be
tested, the treatments would be six in number, AX, AY, AZ,
BX, BY, and BZ, each one being replicated n times and
located in the field in accordance with any of the standard plot
designs. This arrangement does not necessarily mean that
the treatments in the experiment have to be limited to those
embraced by the factorial scheme. In a randomized block
layout, additional treatments entirely separate from the factorial
series may be validly included. The analysis of variance of the
treatments within the factorial scheme would not be affected
by the extra treatments, except in so far as the increased number
of plots in a block makes for inefficient control of the soil
heterogeneity factor. The secondary treatment comparisons should
be dissociated, in the analysis of variance, from those in the
factorial scheme, and the type of factor selected should be
independent of those in the primary treatment series. The more
complex the design, the greater the need to prepare a skeleton
analysis of variance before starting work in the field. A
preliminary analysis of this nature should show whether the
proposed plan is likely to provide a satisfactory answer to the various
problems in their correct order of importance.

The analyses of complex experiments are again merely
specialized examples of the analysis of variance. It is not possible to
quote a standard form which will be truly representative of all
types. Each experiment will have to be considered on its own
merits. The succeeding two examples should illustrate the
various components requiring evaluation in the statistical
treatment and should demonstrate how the complex layout does tend
to enhance the results.
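A skeleton analysis of variance of the kind recommended above amounts to listing the degrees of freedom before any data exist. The sketch below does this for a complete factorial in randomized blocks; the function name and the factor labels are illustrative assumptions, and the layout used (four blocks, three varieties, four rotations) is that of the example that follows.

```python
from itertools import combinations
from math import prod


def skeleton_anova(blocks, factor_levels):
    """Degrees of freedom for a complete factorial in randomized blocks.

    factor_levels maps a factor name to its number of levels.
    """
    rows = {"Blocks": blocks - 1}
    names = list(factor_levels)
    # Main effects (order 1) followed by every interaction (order 2 and up).
    for order in range(1, len(names) + 1):
        for combo in combinations(names, order):
            rows[" x ".join(combo)] = prod(factor_levels[f] - 1 for f in combo)
    total = blocks * prod(factor_levels.values()) - 1
    rows["Error"] = total - sum(rows.values())
    rows["Total"] = total
    return rows


skeleton = skeleton_anova(blocks=4, factor_levels={"Variety": 3, "Rotation": 4})
for factor, df in skeleton.items():
    print(f"{factor:<20}{df:>4}")
```

If the error line comes out with very few degrees of freedom, the plan is unlikely to give a critical test, and the design can be revised before any field work is done.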

Example 35. Analysis of a Fodder Grass Experiment in Which
Treatments Comprised Four Cutting Rotations and Three Varieties
of Grass in All Combinations Giving 4 × 3 Possible Treatment
Types. The layout was on the randomized block principle
with four blocks of 12 plots each. The treatment series were as
follows:

Varieties
Elephant grass
Guatemala grass
Uba cane

Rotations
A Cropped every 45 days or 8 times per year
B Cropped every 90 days or 4 times per year
C Cropped every 120 days or 3 times per year
D Cropped every 180 days or 2 times per year






Table 75. Analysis of Variance

                                        Degrees of
Factor                       S.S.       freedom      Variance
Total                      460,970         47
Blocks                       2,868          3
Varieties                  163,388          2         81,694*
Rotations                  164,650          3         54,883*
Interaction:
  Variety × rotation        95,927          6         15,988*
Error                       34,137         33          1,034

* Significant at the 1 per cent point.

The treatment variances are all obviously significant. An
effective method of summarizing results is to draw up a two-way
table showing the various treatment means which the analysis
of variance has indicated as being significantly different. In
large-scale factorial experiments several separate tables of this
type may be required. They should preferably be recorded in
recognized commercial units, and the appropriate standard
errors should be entered. The numerical value of the standard
error depends on the number of plots associated with the
treatment mean to which it refers. Separate standard errors will
generally have to be evaluated for the two types of main effect
tabulated in the marginal entries and for the entries in the center
of the table from which the interaction variance has been derived.
The appropriate two-way table for this fodder grass experiment
is appended.

Table 76. Mean Yields of Dry Matter

[Rows: Elephant grass, Guatemala grass, Uba cane, Rotation mean.
The entries are missing from this copy.]
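The dependence of the standard error on the number of plots behind each mean can be illustrated with a short sketch. It assumes the error variance of 1,034 (33 degrees of freedom) from Table 75 and the layout of four blocks, three varieties and four rotations; the results are in the plot-yield units of the analysis, not in converted commercial units.

```python
from math import sqrt

ERROR_VARIANCE = 1034  # error line of Table 75
BLOCKS, VARIETIES, ROTATIONS = 4, 3, 4


def se_of_mean(plots):
    """Standard error of a treatment mean based on the given number of plots."""
    return sqrt(ERROR_VARIANCE / plots)


se_variety = se_of_mean(BLOCKS * ROTATIONS)    # marginal variety mean, 16 plots
se_rotation = se_of_mean(BLOCKS * VARIETIES)   # marginal rotation mean, 12 plots
se_cell = se_of_mean(BLOCKS)                   # variety x rotation entry, 4 plots

for label, se in [("variety", se_variety), ("rotation", se_rotation),
                  ("variety x rotation", se_cell)]:
    # Three times the standard error is the critical difference used in the text.
    print(f"{label:<20} S.E. {se:6.2f}   3 x S.E. {3 * se:6.2f}")
```

The entries in the body of the two-way table rest on only four plots each, so their standard error is twice that of the variety means.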

Using three times the standard error as the critical difference,
it is obvious from the right-hand marginal entries that the
varieties may be arranged in the following order of merit on the
average result of four different cutting rotations:

a. Uba cane
b. Guatemala grass
c. Elephant grass

The rotation means show that the 90- and 120-day series (B and
C) are intermediate in yield potentiality between A, the short
rotation of 45 days, and D, the long one of 180 days. Series D
has given much the highest yield of dry matter per acre, as
assessed from the average or aggregate response from all three
varieties of fodder grass. In contrast to this general effect,
the significant interaction shows that with the elephant grass
there is a significant drop in yield from the C to D series, while
with the two other varieties there is a significant rise which is
particularly marked in the case of the Uba cane variety. The
difference in the length of the cutting rotation between the B
and C series has not caused any significant alteration in the mean
yield for any of the three varieties. For each of the grasses
separately, a 90- to 120-day cutting rotation produces a significant
increase in yield over the 45-day rotation, i.e., Series B and C
are superior to Series A.

This completes the summary of results. Any of the conclusions
noted may be immediately verified by reference to the
two-way table of mean values. In writing up experimental
results, tables showing the significant treatment mean values
with the appropriate standard errors are the only ones essential
to an effective presentation of the data and conclusions. In
more complex experiments it might also be advisable to include
the analysis of variance table as the simplest method of
demonstrating the experimental design, the nature of the error variance,
and the response of the nonsignificant treatment series. For
future reference, the actual yield data might sometimes be [...]

SPLIT-PLOT EXPERIMENT

In an experiment of the complex type, it is often advantageous
to use a standard field arrangement with relatively large plots
for one series of treatments. By subdivision of these whole
plots into so many similar subplots, a second series of treatments
may be superimposed on the first. The number of subplots in
each whole plot should be made equal to the number of treatments
in the second series, and each of these treatments should occur
once and once only in each whole plot, the allocation over the
subplots being a random one. This will ensure a balanced layout
covering all possible combinations of the treatments in the
two series and permitting of a straightforward and valid statistical
interpretation of the results. For certain types of experiment
this split-plot design may greatly simplify the field practice. For
example, in comparing different depths of ploughing, relatively
large plots are practically a necessity if the ploughs are to do
accurate work. Similarly, treatments that have to be harvested
on different dates are much more accessible, if a reasonably large
plot can be cut at one time.

In contrast to the complete randomization of treatment
types as described in the previous example, the split-plot system
provides a more critical comparison of the subplot treatments
but a less critical comparison of the whole-plot units. One
reason is that the number of replications and, consequently, the
number of degrees of freedom pertaining to the estimate of error
is much greater in the former than in the latter. Furthermore,
the large size of the whole plot gives less efficient control over
soil heterogeneity. It is for this reason that, in a split-plot
experiment, a Latin-square arrangement of the whole plots is
definitely preferable to a randomized block layout. In any case,
the less important treatment comparisons should be allocated to
the whole plots, and the treatment series for which a really
critical test is desired should be located in the subplots.
Alternatively, in experiments in which all the treatment comparisons
are of equal importance, the whole plots should be used for those
likely to show relatively large treatment differences.

Just as in perennial crops, a number of successive harvests
does not increase the number of degrees of freedom upon which
the estimate of error of the aggregate plot yield is based, so the
subdivision of the whole plots does not entail any multiplication
of the whole-plot replications. Where subdivision of the whole
plots is practiced, the correct type of analysis is the one involving
two estimates of error, (a) and (b). In working out the analysis
of variance, it is best to express all the values in the smallest
units or subplots to which the whole plots have been subdivided.

Example 36. Statistical Analysis of Data from a Complex
Cotton Experiment* in Which Certain Treatments Appear in
Subplot Units. In this experiment, four sowing dates, three
spacings, three rates of irrigation, and two rates of nitrogenous
manuring were superimposed in all combinations giving 4 × 3 ×
3 × 2 or 72 different treatment types. The individual treatments
in each series were as follows:

Sowing date
I. July 24
II. August 11
III. Sept. 2
IV. Sept. 26

Spacing
(a) 25 cm. between holes
(b) 50 cm. between holes
(c) 75 cm. between holes

Irrigation
x Light
y Medium
z Heavy

Fertilizer
(O) Control
(N) Sulphate of ammonia at rate of 600 rotls per feddan

The layout consisted of four blocks, each containing four large
whole plots to accommodate the four different dates of sowing on
a random arrangement within each block. Every whole plot was
subdivided into nine subplots to take all combinations (3 × 3)
of the spacing and irrigation series, again located at random
over the subplots in each whole plot. Each of the 144 subplots
was in turn subdivided into two half subplots, one-half of each
pair being given a nitrogenous fertilizer and the other half being
used as a control. There were therefore in the experiment 16
whole plots, 144 subplots, and 288 half subplots, entailing in
the analysis of variance three separate estimates of error, each
one applicable to its own particular treatment comparisons.
Each of these three sections can be regarded as an independent
experiment and analyzed on the randomized block principle.
For example, for the subplot treatments, the 16 whole plots are
exactly equivalent to 16 randomized blocks. As already noted,
it is better to work throughout in the smallest plot units and to
draw up, in a single table, a composite analysis for the whole
data. This makes it possible to assess first-, second-, and
third-order treatment interactions and to derive the full benefit from
the complex layout.

* J. Agr. Sci., 22: 616.

To reduce Table 77 to a reasonable size, only the mean yield
per half subplot for each of the 72 different treatment types has
been recorded. Each of these yields represents the mean of the
four replicates from the four large blocks included in the
experiment. This fact must be remembered in using Table 77 to work
out the analysis of variance.


Table 78. Analysis of Variance

                                                        Degrees of
Factor                                         S.S.     freedom     Variance
Total whole plots                              4,531       15
  Blocks                                       1,000†       3
  Sowing date                                  3,129        3       1,043.0**
  Error (a)                                      402        9          44.7
Total subplots                                 7,609      143
  Whole plots, the subplot blocks              4,531       15
  Spacing                                        562        2         281.0**
  Irrigation                                   1,022        2         511.0**
  Interactions: Spacing × irrigation               7        4           1.8
    Sowing date × spacing                        660        6         110.0**
    Sowing date × irrigation                     134        6          22.3**
    Sowing × spacing × irrigation                103       12           8.6
  Error (b)                                      590       96           6.15
Total half subplots                                       287
  Subplots, the half subplot blocks            7,609      143
  Nitrogen                                     6,559        1       6,559.0**
  Interactions: N × sowing date                1,316        3         438.7**
    N × spacing                                  305        2         152.5**
    N × irrigation                               360        2         180.0**
    N × spacing × irrigation                      27        4           6.8
    N × sowing × irrigation                      108        6          18.0**
    N × sowing × spacing                          74        6          12.3*
    N × sowing × spacing × irrigation             86       12           7.2
  Error (c)                                      552      108           5.11

† This value is merely an arbitrary one put in to complete the analysis. It cannot be
calculated from Table 77.
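The three error lines of this analysis can be checked against the layout. The sketch below derives the degrees of freedom for errors (a), (b) and (c) from the plot counts and then forms variance ratios for a few effects, each against the error of its own stratum. The sum of squares of 590 for error (b) is an assumption: it is the value implied by the tabulated variance of 6.15 on 96 degrees of freedom.

```python
# Layout: 4 blocks, 4 sowing dates on whole plots, 3 spacings x 3 irrigations
# on subplots, 2 nitrogen rates on half subplots.
blocks, dates, spacings, irrigations, n_rates = 4, 4, 3, 3, 2

whole_plots = blocks * dates                      # 16
subplots = whole_plots * spacings * irrigations   # 144
half_subplots = subplots * n_rates                # 288

# Each error is what remains of its stratum after blocks and treatments.
df_error_a = (whole_plots - 1) - (blocks - 1) - (dates - 1)

# 8 d.f. among the 9 spacing x irrigation combinations, plus their
# interactions with sowing date (8 x 3), i.e. 8 x 4 in all.
subplot_treatment_df = (spacings * irrigations - 1) * dates
df_error_b = (subplots - 1) - (whole_plots - 1) - subplot_treatment_df

# Nitrogen (1 d.f.) plus its interactions with the other 35 treatment d.f.
half_treatment_df = (n_rates - 1) * dates * spacings * irrigations
df_error_c = (half_subplots - 1) - (subplots - 1) - half_treatment_df

error_variance = {
    "a": 402 / df_error_a,
    "b": 590 / df_error_b,   # S.S. of 590 is an assumption, see lead-in
    "c": 552 / df_error_c,
}

# (effect, S.S., d.f., error stratum used for the test)
effects = [
    ("Sowing date", 3129, 3, "a"),
    ("Spacing", 562, 2, "b"),
    ("Irrigation", 1022, 2, "b"),
    ("Nitrogen", 6559, 1, "c"),
    ("N x sowing date", 1316, 3, "c"),
]
f_ratio = {name: (ss / df) / error_variance[e] for name, ss, df, e in effects}

for name, value in f_ratio.items():
    print(f"{name:<18}{value:10.1f}")
```

Testing sowing date against error (b) or (c) instead of error (a) would greatly overstate its significance, which is the practical reason for keeping the three strata apart.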
As an illustration of the way in which the components of the
analysis of variance have been obtained, the calculation of a few
of them is appended.

$$\text{S.S. sowing date} = \left[\frac{368.2^2 + 446.9^2 + 401.6^2 + 284.7^2}{72} - \frac{(1{,}501.4)^2}{288}\right] \times 16 = 3{,}129$$

$$\text{S.S. nitrogen} = \left[\frac{922.5^2 + 578.9^2}{144} - \frac{(1{,}501.4)^2}{288}\right] \times 16 = 6{,}559$$

Interaction: nitrogen × sowing date.

$$\text{Aggregate nitrogen and sowing date S.S.} = \left[\frac{118.6^2 + 249.6^2 + 170.0^2 + \cdots}{36} - \frac{(1{,}501.4)^2}{288}\right] \times 16\dagger\dagger = 11{,}004$$

$$\text{Interaction, nitrogen} \times \text{sowing date} = 11{,}004 - (6{,}559 + 3{,}129) = 1{,}316$$

Alternatively, this interaction can be calculated directly from
the differences between comparable totals.

I + II with N = 526.5        III + IV with N = 396.0
I + II with O = 288.6        III + IV with O = 290.3
Difference D1 = 237.9                   D2 = 105.7

I + IV with N = 405.8        II + III with N = 516.7
I + IV with O = 247.1        II + III with O = 331.8
           D3 = 158.7                   D4 = 184.9

I + III with N = 489.4       II + IV with N = 433.1
I + III with O = 280.4       II + IV with O = 298.5
            D5 = 209.0                  D6 = 134.6

D1 - D2 = +132.2
D3 - D4 = -26.2
D5 - D6 = +74.4

$$\text{S.S. interaction, nitrogen} \times \text{sowing date} = \frac{(132.2^2 + 26.2^2 + 74.4^2) \times 16\dagger\dagger}{288} = 1{,}316$$

†† This factor is necessary to compensate for data tabulated as mean
values.
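That the two routes to the nitrogen × sowing date interaction agree can be confirmed numerically. In the sketch below only the totals 118.6, 249.6 and 170.0 are quoted directly in the text; the remaining five are an assumption, recovered from the pairwise sums given above. The factor 16 is the square of the four replicates behind each tabulated mean.

```python
# Totals of the Table 77 means, by sowing date, with and without nitrogen.
# Five of the eight are recovered from the pairwise sums (assumption).
with_n = {"I": 249.6, "II": 276.9, "III": 239.8, "IV": 156.2}
without_n = {"I": 118.6, "II": 170.0, "III": 161.8, "IV": 128.5}

REPLICATES = 4
N_HALF_SUBPLOTS = 288
SCALE = REPLICATES ** 2  # turns squares of means into squares of plot totals

grand = sum(with_n.values()) + sum(without_n.values())
cf = grand ** 2 / N_HALF_SUBPLOTS


def ss(totals, plots_per_total):
    """Sum of squares of a set of totals of means, in half-subplot units."""
    return (sum(t ** 2 for t in totals) / plots_per_total - cf) * SCALE


# Method 1: aggregate S.S. for the 8 combinations less the two main effects.
ss_aggregate = ss(list(with_n.values()) + list(without_n.values()), 36)
ss_date = ss([with_n[d] + without_n[d] for d in with_n], 72)
ss_nitrogen = ss([sum(with_n.values()), sum(without_n.values())], 144)
interaction_by_subtraction = ss_aggregate - ss_date - ss_nitrogen

# Method 2: three orthogonal comparisons among the responses to nitrogen.
response = {d: with_n[d] - without_n[d] for d in with_n}
contrasts = [
    response["I"] + response["II"] - response["III"] - response["IV"],
    response["I"] + response["IV"] - response["II"] - response["III"],
    response["I"] + response["III"] - response["II"] - response["IV"],
]
interaction_by_contrasts = (
    sum(c ** 2 for c in contrasts) * SCALE / N_HALF_SUBPLOTS
)

print([round(c, 1) for c in contrasts])
print(round(interaction_by_subtraction, 1), round(interaction_by_contrasts, 1))
```

Both methods give about 1,316.6; the small differences from the tabulated 3,129 and 1,316 arise from rounding in the recorded means.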
A similar technique might be used to calculate the higher order
interactions, but where a number of factors are involved, it is
simpler to use the former method in which the aggregate sum of
squares for the factors separately is subtracted from the total
treatment sum of squares for all combinations of the factors.

Thus, S.S. interaction, sowing date × spacing × irrigation

$$= \left[\frac{31.4^2 + 34.8^2 + \cdots + 33.2^2 + 26.0^2}{8} - \frac{(1{,}501.4)^2}{288}\right] \times 16 - (3{,}129 + 562 + 1{,}022 + 7 + 660 + 134) = 103$$

The following résumé of the chief results has been taken
verbatim from the original article.

Yield both with and without nitrogen application has optimal value
for August sowing.

The returns in yield for nitrogen application decline with advancing
sowing date.

Spacing has little effect with early sowing but has large effect with late
sowing, irrespective of nitrogen application.

Water supply with early sowing and nitrogen application has large
effect.

Water supply with early sowing without nitrogen has little effect.

The effect of water supply tends to disappear with advancing sowing
date irrespective of nitrogen application. Various combinations of
factors may be utilised to give maximal yield, thus giving considerable
latitude in sowing date without sacrifice of yield. The inter-relations
of the factors studied indicate the limits between which, by suitable
practice, the yield of cotton may be improved or controlled.

The results are therefore both comprehensive and conclusive
and bear witness to the advantages of complex experimentation
where both the field and laboratory control is sufficiently skilled.

ANALYSIS OF COVARIANCE IN FIELD EXPERIMENTS 

One of the chief difficulties in obtaining conclusive results from
field trials is the impossibility of finding even approximately
uniform plots. No matter how technically perfect the design
of an experiment may be, there will still remain very definite
differences from plot to plot in soil and environmental fertility,
in germination, in disease and pest incidence, and in the
ultimate plant population from which the yield data are recorded.
Modern methods of layout have greatly reduced the effects of this
heterogeneity on the final interpretation of the results. Even
today, however, in experiments in which the general design and
execution are beyond reproach, the plot variation is often
sufficiently great to mask certain real differences between the
treatments under comparison. In many experiments this plot
variation is obviously correlated with certain external factors as
plant population, soil fertility, age of the crop, number of tillers,
etc. When a fair estimate of the coefficient of correlation between
the yield data and the particular external factor influencing results
can be computed, it may be possible to use this information to
produce a valid reduction in the estimate of experimental error
and to demonstrate real treatment differences that would
otherwise have been swamped in plot variability. For example,
preliminary uniformity trials might be used to assess the fertility
values of certain plots required for subsequent experiments. On
the assumption that these values do not change greatly for the
succeeding experimental crop, by the analysis of covariance it
is possible to use the data from the uniformity trial to adjust
the yields in the experiment so as to compensate for soil fertility
differences between the plots. A valid statistical comparison
of the corrected treatment yields can then be carried out and an
accurate estimate of their respective merits obtained. Similarly,
many experiments are spoiled because, owing to uncontrollable
environmental factors, the plant population is far from uniform.
It is often possible to make a count of the number of plants per
plot and to use this to correct the yields for population. It is
not sufficient merely to divide the yield by the plant number, as,
of course, widely spaced plants develop very differently from
closely spaced ones, and a comparison based directly on the yield
per plant would be biased in favor of the thinly populated treatments.
The statistical treatment depends on the use of the regression
coefficient to determine the average yield that might be expected
for any given number of plants and on the dispersion of the
treatment means relative to this linear regression. In the following
examples, it is assumed that the reader is familiar with the
elementary facts relative to the calculation and significance of the
coefficient of regression and to its application in a simple analysis
of covariance as detailed in Chap. VI.

Example 37. Analysis of Covariance Applied to a Cotton
Varietal Test. In the following small experiment, five rows of
each of three selected varieties of cotton were sown. The rows
were arranged on the randomized block principle. The quantity
of seed of each variety available was limited, so that it was
impossible to equalize the plant number per row and make up for
deficiencies in germination, mortality incidence, etc. A count
of the number of plants per row surviving at harvest was
accordingly taken, and the results are appended.



Table 79. Population Number and Yield of Cotton Lint in Ounces
per Row

No. of plants per row (y)

                          Number of block         Variety   Variety
Variety               I    II   III    IV     V    total     mean
A                    16    12    13    10     8      59      11.8
B                    12    14    18    15    10      69      13.8
C                     7    13    10     7     6      43       8.6
Row or block total   35    39    41    32    24    Grand total 171

Yield of lint per row (x)

                          Number of block         Variety   Variety
Variety               I    II   III    IV     V    total     mean
A                    11     8     9     6     5      39       7.8
B                     5     6    10     8     4      33       6.6
C                     4     7     6     3     4      24       4.8
Row or block total   20    21    25    17    13    Grand total 96

Consider first the analysis of variance of x, the yield data,
alone. The variances for variety and error are, respectively, 11.4
and 3.74, giving a calculated value of F = 3.05. The
corresponding reading from the Table of F at the 5 per cent level
is 4.46, proving that, for the yield data alone, there is no
significant difference between the three varieties. There is, however,
considerable variation in the number of plants per plot, which
presumably is partly responsible for the high estimate of the
error of x. It is proposed, therefore, to carry out an analysis
of covariance, so as to determine whether the varieties are
significantly different when the mean yields are adjusted on a
basis equalizing the number of plants per plot. The first step
is to calculate the coefficient of regression of yield on plant
number, and verify that it is significant. The best estimate of
the coefficient of regression comes from the error line of the
analysis given in Table 80, i.e., after the variety and block
effects have been eliminated.


$$b_{xy} = \frac{\text{S.P. } xy}{\text{S.S. } y} = \frac{36.1}{47.2} = 0.7648$$
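The error-line figures behind this coefficient can be recomputed from the plot data. The sketch below partitions the sums of squares and the sum of products into blocks, varieties and error. Two yield entries in block IV (8 for variety B and 3 for variety C) are an assumption, recovered from the variety and block totals; the function name is illustrative.

```python
# Rows are blocks I to V. x = yield of lint (oz per row), y = plants per row.
plants = {"A": [16, 12, 13, 10, 8], "B": [12, 14, 18, 15, 10], "C": [7, 13, 10, 7, 6]}
lint = {"A": [11, 8, 9, 6, 5], "B": [5, 6, 10, 8, 4], "C": [4, 7, 6, 3, 4]}


def partition(u, v):
    """Split the sum of products of u and v into blocks, variety and error.

    With u and v the same variate this is the ordinary analysis of variance.
    """
    varieties = list(u)
    n_blocks = len(u[varieties[0]])
    n = len(varieties) * n_blocks
    grand_u = sum(sum(u[t]) for t in varieties)
    grand_v = sum(sum(v[t]) for t in varieties)
    cf = grand_u * grand_v / n
    total = sum(a * b for t in varieties for a, b in zip(u[t], v[t])) - cf
    variety = sum(sum(u[t]) * sum(v[t]) for t in varieties) / n_blocks - cf
    blocks = sum(
        sum(u[t][j] for t in varieties) * sum(v[t][j] for t in varieties)
        for j in range(n_blocks)
    ) / len(varieties) - cf
    return {"total": total, "blocks": blocks, "variety": variety,
            "error": total - blocks - variety}


ss_x = partition(lint, lint)
ss_y = partition(plants, plants)
sp_xy = partition(lint, plants)

# Regression of yield on plant number, estimated from the error line.
b_xy = sp_xy["error"] / ss_y["error"]

# Unadjusted test of varieties: 2 and 8 degrees of freedom.
f_unadjusted = (ss_x["variety"] / 2) / (ss_x["error"] / 8)

print({k: round(val, 1) for k, val in ss_x.items()})
print({k: round(val, 1) for k, val in ss_y.items()})
print({k: round(val, 1) for k, val in sp_xy.items()})
print(round(b_xy, 4), round(f_unadjusted, 2))
```

Working from unrounded sums the coefficient comes to about 0.764, against 0.7648 obtained from the rounded figures 36.1 and 47.2.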


The significance of b_xy may be determined by calculating the
standard error and comparing the calculated value of t with
the appropriate reading from the Table of t. Here, the
alternative method by splitting up the error variance of x to its two
components (the linear regression and deviations from this
regression) and reference to the Table of F is probably easier.

$$\text{Linear regression S.S.} = \frac{(\text{S.P. } xy)^2}{\text{S.S. } y} = \frac{36.1^2}{47.2} = 27.6 \text{ with 1 degree of freedom}$$

S.S. deviations from regression = 29.9 - 27.6 = 2.3 with
7 degrees of freedom

$$F = \frac{27.6}{2.3/7} = 84.0$$

The reading from the table for n1 = 1, n2 = 7, and P = 0.01 is
only 12.25, proving that the coefficient of regression of yield on
plant number is highly significant. The analysis of the reduced
variance may now be validly applied to adjust the yield data for
variation in plant number and increase the accuracy of the
statistical evaluation. This final analysis is limited to the
treatment and error components, as it may fairly be assumed that
the block effects have already been taken out in the original
analysis of Table 80. The procedure then becomes identical
with that already given in Table 62 (Chap. VI).
Table 81. Analysis of Reduced Variance

                                                S.S. x -              Degrees
                                               (S.P. xy)^2/           of        Reduced
Factor            S.S. x   S.S. y   S.P. xy      S.S. y               freedom   variance
Variety            22.8     68.8     27.6        11.73                  1        11.73
Error              29.9     47.2     36.1         2.29                  7         0.33
Total (variety
  + error)         52.7    116.0     63.7        17.73                  9
Residual                                         15.43                  2         7.71
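The reduction applied in this table is the same operation on each line: subtract from the sum of squares of x the part accounted for by regression on y. A minimal sketch, assuming the rounded entries for the variety and error lines and treating the residual as the reduced total less the reduced error (2 degrees of freedom):

```python
def reduced_ss(ss_x, ss_y, sp_xy):
    """S.S. of x after removing the linear regression of x on y."""
    return ss_x - sp_xy ** 2 / ss_y


# (S.S. x, S.S. y, S.P. xy) for each line of the analysis.
variety = (22.8, 68.8, 27.6)
error = (29.9, 47.2, 36.1)
total = tuple(v + e for v, e in zip(variety, error))

error_reduced = reduced_ss(*error)    # 7 degrees of freedom
total_reduced = reduced_ss(*total)    # 9 degrees of freedom

# Adjusted variety effect: reduced total less reduced error.
residual = total_reduced - error_reduced
residual_variance = residual / 2
error_variance = error_reduced / 7
f_adjusted = residual_variance / error_variance

print(round(error_reduced, 2), round(total_reduced, 2), round(residual, 2))
print(round(residual_variance, 2), round(error_variance, 2), round(f_adjusted, 1))
```

The reduced error variance of about 0.33 is roughly one-eleventh of the unadjusted 3.74, which is what turns the earlier nonsignificant varietal comparison into a significant one.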


The F test shows that the residual variance compared with
that of error is significant on a probability approaching 0.01.
This proves that there is a significant difference between the
variety means corrected for plant number. The corrected mean
values must now be calculated from the formula

$$x_t - b_{xy}(y_t - M_y)$$

the notation being that previously used in Chap. VI. In this
example b_xy is 0.7648, as already calculated.

Table 82. Table of Yields Corrected for Plant Number

            Mean no.                                       Mean
            of plants                                      yield     Corrected
Variety      (y_t)     (y_t - M_y)   b_xy(y_t - M_y)       (x_t)       yield
A            11.8         +0.4          +0.306              7.8        7.49
B            13.8         +2.4          +1.835              6.6        4.77
C             8.6         -2.8          -2.141              4.8        6.94
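The correction is a one-line adjustment per variety, sketched below with the means from the data table and the coefficient 0.7648. The general mean of plant number is taken as the average of the three variety means, which is valid here because every variety has the same number of rows.

```python
B_XY = 0.7648  # regression of yield on plant number, from the error line

mean_plants = {"A": 11.8, "B": 13.8, "C": 8.6}   # y_t
mean_yield = {"A": 7.8, "B": 6.6, "C": 4.8}      # x_t

# General mean of plant number, M_y (equal replication assumed).
general_mean_plants = sum(mean_plants.values()) / len(mean_plants)

# Move each variety mean along the regression line to the common stand.
corrected = {
    v: mean_yield[v] - B_XY * (mean_plants[v] - general_mean_plants)
    for v in mean_yield
}

for v in corrected:
    print(v, round(mean_plants[v] - general_mean_plants, 1), round(corrected[v], 2))
```

Variety B, which had the densest stand, loses most by the adjustment, and variety C, with the thinnest stand, gains most.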


The standard error of the difference D between any pair of these
corrected mean yields is

$$\sqrt{E\left[\frac{2}{n} + \frac{D_y^2}{\text{S.S. } y \text{ for error}}\right]}$$

where E = reduced error variance of x,
n = number of plots from which each mean is calculated,
D_y = difference between the corresponding pair of means
for plant number.

Standard error, A - B:

$$t \text{ (by calculation)} = \frac{2.72}{\sqrt{0.33\left[\dfrac{2}{5} + \dfrac{2.0^2}{47.2}\right]}} = 6.80$$

Reference to the Table of t for n = 7 (the degrees of freedom of
the reduced error variance) shows this value to be significant
on a probability less than 0.01, proving that variety A is a
better yielder than B when allowance is made for differences in
plant population.

Testing now A - C:

$$t = \frac{0.55}{\sqrt{0.33\left[\dfrac{2}{5} + \dfrac{3.2^2}{47.2}\right]}} = 1.21$$

and for C - B:

$$t = \frac{2.17}{\sqrt{0.33\left[\dfrac{2}{5} + \dfrac{5.2^2}{47.2}\right]}} = 3.81$$
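The three comparisons can be run through one function, as in the sketch below. It assumes the rounded values used in the text: a reduced error variance of 0.33, five plots per mean, an error sum of squares of 47.2 for plant number, and the corrected yields 7.49, 4.77 and 6.94.

```python
from math import sqrt

E = 0.33            # reduced error variance of x, 7 degrees of freedom
N_PLOTS = 5         # rows behind each variety mean
SS_Y_ERROR = 47.2   # error sum of squares of plant number

corrected = {"A": 7.49, "B": 4.77, "C": 6.94}
mean_plants = {"A": 11.8, "B": 13.8, "C": 8.6}


def t_value(first, second):
    """t for the difference between two corrected variety means."""
    d = corrected[first] - corrected[second]
    d_y = mean_plants[first] - mean_plants[second]
    # The standard error grows with the gap in plant number between the pair.
    se = sqrt(E * (2 / N_PLOTS + d_y ** 2 / SS_Y_ERROR))
    return d / se


for pair in [("A", "B"), ("A", "C"), ("C", "B")]:
    print(pair, round(t_value(*pair), 2))
```

The last value comes out near 3.83 rather than 3.81, a difference attributable to rounding at intermediate steps.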


For 7 degrees of freedom, the first of these two values of t
is nonsignificant, and the second is significant on a probability
approaching 0.01.

In conclusion, therefore, it can be stated that when conditions
are equalized as regards the population factor both the A and the
C varieties are markedly superior in yield to B. This is a
strikingly different result from that obtained from a
straightforward analysis of variance of the yield data alone. Not
only is a negative result changed into a positive one, but the
relative position of the three varieties is very different from what
might be anticipated from the actual mean varietal yields
(Table 82).

It is possibly of interest to show how the expression for the
standard error has been derived. D represents the difference
between the actual mean yields of the two varieties less b_xy
× the difference between the corresponding pair of means for
plant number. The first part of the standard error, 2E/n, is the
variance of the first component of D, and the second part,
D_y^2 × E/(S.S. y for error), the variance of the second component.
The standard error of the difference between these two components
is therefore equivalent to the square root of the sum of these two
variances, i.e., the standard error of D is given by the formula quoted.
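The argument can be written out in symbols. In the sketch below, $D_x$ (the difference between the two uncorrected mean yields) is a label introduced here for convenience, and the two components are treated as independent, which is the assumption implicit in adding their variances.

$$
\begin{aligned}
D &= D_x - b_{xy} D_y \\
\operatorname{Var}(D_x) &= \frac{E}{n} + \frac{E}{n} = \frac{2E}{n} \\
\operatorname{Var}(b_{xy}) &= \frac{E}{\text{S.S. } y \text{ for error}}
\quad\Longrightarrow\quad
\operatorname{Var}(b_{xy} D_y) = \frac{D_y^2\, E}{\text{S.S. } y \text{ for error}} \\
\operatorname{Var}(D) &= E\left[\frac{2}{n} + \frac{D_y^2}{\text{S.S. } y \text{ for error}}\right],
\qquad
\text{S.E.}(D) = \sqrt{E\left[\frac{2}{n} + \frac{D_y^2}{\text{S.S. } y \text{ for error}}\right]}
\end{aligned}
$$

Here $D_y$ is regarded as fixed, so that only the sampling error of $b_{xy}$ contributes to the second term.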
UNIFORMITY TRIAL AS A CONTROL OF PLOT VARIATION

With orchard and perennial crops in general, genetical
heterogeneity, seasonal fluctuations, age differences, the limited
number of individuals on each plot, and the unavoidable extensive
acreage of any large-scale experiment all combine to make the
uncontrollable variability between units an even more serious
hindrance to successful yield trials than in the case of annual
Table 83. Yield of Cane in Hundredweights per Plot

                    Ratoon crop (x)        Plant crop (y)
Treatment           Total      Mean        Total      Mean
O                    133       26.6         238       47.6
M                    188       37.6         280       56.0
I                    181       36.2         172       34.4
MI                   210       42.0         232       46.4

Grand total          712                    922
General mean                   35.6                   46.1

crops. With perennials, preliminary uniformity trials, as a
means of estimating the relative fertility values of the ultimate
experimental units, may often be utilized to considerable
advantage. The above data from a manurial experiment with sugar
cane provide an excellent illustration of this, the plant cane crop
being used to measure the potential fertility of the plots and
the mean yields of the various treatments given to the ratoon
crop being adjusted accordingly. The fertilizers applied were
farmyard manure and a complete artificial in all combinations
of the two rates of dressings 0 and 1, resulting in the following
four treatments:

O = control: no fertilizer applied.

M = farmyard manure at the rate of 20 tons per acre.


I = complete inorganic fertilizer at the rate of 90 pounds
N, 375 pounds P2O5, and 50 pounds K2O per acre.

MI = combined farmyard and inorganic fertilizers at the
rates quoted above.

The yield of cane for the two crops in hundredweights per
plot and the analyses of variance and covariance are
appended.

The analysis of variance of x was first used to test whether
there was any significant difference between the treatment
means of the experimental crop. The biggest treatment variance


Table 84. Analysis of Variance and Covariance

                     Degrees   Analysis of         Analysis of         Analysis of
                     of        variance of x       variance of y       covariance, xy
Factor               freedom   S.S.      Mean      S.S.      Mean      S.P.      Mean
                                         square              square              S.P.
Total                  19      2,488.8             4,303.8             2,095.8
Blocks                  4        368.8               505.8               395.0
Treatments:
  Farmyard manure       1        352.8    352.8      520.2†   520.2      428.4     428.4
  Inorganics            1        245.0    245.0      649.8†   649.8     -399.0    -399.0
  Interaction I × M     1         33.8     33.8       16.2†    16.2      -23.4     -23.4
Error                  12      1,488.4    124.0    2,611.8    217.7    1,694.8     141.2

† For the plant cane crop, the grouping into treatments is, of course, imaginary, but it is
better to carry out the analysis so as to develop parallel series for x, y, and xy.


is that for the response to farmyard manure. This gives a
calculated value of z of 0.5227 as compared with the reading from the
Table of z (for n1 = 1, n2 = 12, and P = 0.05) of 0.7788.
The treatment means must therefore be regarded as not
significantly different even though the mean values show considerable
variation, especially when the no manure treatment is compared
with the others. The explanation of this apparently lies in the
relatively high error variance as a consequence of the large
variation between the yields of similar plots. It was therefore decided
to use the regression of x on y, i.e., of the ratoon crop yields
corrected in accordance with the yields recorded in the uniformity
trial or plant cane crop. The first step is, of course, to test the
significance of the regression coefficient.
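As an illustrative sketch only, the z comparison and the test of the regression coefficient described above can be worked from the farmyard-manure and error lines of Table 84. The variable names are our own, and the standard-error formula (the reduced error variance divided by the error sum of squares of y, under a square root) is assumed to be the conventional one rather than taken from the text.

```python
import math

# Figures as read from Table 84 (x = ratoon crop, y = plant cane crop).
ms_manure_x = 352.8    # mean square for farmyard manure, x
ms_error_x = 124.0     # error mean square, x
ss_error_x = 1488.4    # error sum of squares, x
ss_error_y = 2611.8    # error sum of squares, y
sp_error_xy = 1694.8   # error sum of products, xy
error_df = 12


def fisher_z(larger_variance, smaller_variance):
    """Fisher's z: half the natural log of the variance ratio."""
    return 0.5 * math.log(larger_variance / smaller_variance)


# z for the largest treatment variance against error.
z = fisher_z(ms_manure_x, ms_error_x)

# Regression coefficient of x on y from the error line.
b = sp_error_xy / ss_error_y

# Error sum of squares of x after fitting the regression (one df is used up).
reduced_ss = ss_error_x - sp_error_xy ** 2 / ss_error_y
reduced_df = error_df - 1

# Assumed conventional standard error of b, and the resulting t.
se_b = math.sqrt(reduced_ss / (reduced_df * ss_error_y))
t = b / se_b

print(f"z = {z:.4f}, b = {b:.3f}, s.e. = {se_b:.3f}, t = {t:.2f}")
```

The computed z falls short of the tabulated 0.7788, while t is well above the 1 per cent value for 11 degrees of freedom, which is the pattern the text describes.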
b (from error variance) = 1,694.8 / 2,611.8 = +0.649

\[
\text{Standard error of } b = \sqrt{\frac{1{,}488.4 - (1{,}694.8)^2 / 2{,}611.8}{11 \times 2{,}611.8}} = 0.117 \quad (11 \text{ degrees of freedom})
\]

t (by calculation) = 0.649 / 0.117 = 5.55

which corresponds to a probability considerably less than 0.01.
Therefore, after due allowance has been made for treatment and
block effects, there is a marked positive correlation between the
yields of the individual plots in the ratoon and plant cane crops.
It is now permissible to use the Σ(x - X)² analysis to determine
any significant differences between the various treatments.

Where the treatments are complex, as in this experiment,
it is advisable to carry out the Σ(x - X)² analysis for each type
of comparison and interaction separately.

Table 85. Analysis of Reduced Variance for Farmyard-manure
Main Effect

                                                               Degrees of
Factor              S.S. x     S.S. y     S.P. xy   Σ(x - X)²  freedom
Farmyard manure       352.8      520.2      428.4       0
Error               1,488.4    2,611.8    1,694.8     388.4       11
Total               1,841.2    3,132.0    2,123.2     402.2
Residual                                               13.8        1

The reduced variances for the treatment and error components
are not significantly different, proving that there is no apparent
response to the dressing of farmyard manure, even when the data
from the uniformity trial are used to provide an equalized
estimate of the treatment means. As already explained in
Chap. VI, with only two treatments, the reduced variance in
the first line of the above table is bound to be zero and need not
be calculated.

Table 86. Analysis of Reduced Variance for Inorganic Manures
Main Effect

                                                               Degrees of   Mean
Factor              S.S. x     S.S. y     S.P. xy   Σ(x - X)²  freedom      square
Inorganics            245.0      649.8     -399.0
Error               1,488.4    2,611.8    1,694.8     388.4       11          35.3
Total               1,733.4    3,261.6    1,295.8   1,219.7
Residual                                              831.3        1         831.3

F (Table 86) is 23.55, a value which is significant on a
probability less than 0.01. The error variance can therefore be used to
compare the corrected mean values for the 10 plots with inorganics
and the 10 plots without inorganics:

Table 87

                     Mean yield,    Correction,              Corrected
Treatment            x              b(y mean - general mean) yield
No inorganics          32.1         0.649(51.8 - 46.1)         28.4
With inorganics        39.1         0.649(40.4 - 46.1)         42.8
Difference                                                     14.4

\[
\text{Standard error of difference between corrected yields} = \sqrt{35.3\left(\frac{2}{10} + \frac{(51.8 - 40.4)^2}{2{,}611.8}\right)} = 2.97
\]

t (by calculation) = 14.4 / 2.97 = 4.85 (11 degrees of freedom)


This value of t is significant on a probability less than 0.01.
Actually, as there are only two treatment means involved, the
t test is redundant, since it merely represents another method of
arriving at exactly the same result as already obtained by
calculating F.
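The equivalence of the F and t tests noted above can be checked with a short sketch. It assumes the figures of Tables 84, 86 and 87 as we read them, with the sum of products for inorganics taken as negative (the sign needed for the total of Table 84 to balance). All names are our own, and small differences from the printed values reflect rounding in the original arithmetic.

```python
import math

# x = ratoon yield, y = plant cane yield; figures as read from Tables 84 and 86.
ss_x = {"inorganics": 245.0, "error": 1488.4}
ss_y = {"inorganics": 649.8, "error": 2611.8}
sp_xy = {"inorganics": -399.0, "error": 1694.8}  # sign of first entry is inferred
error_df = 12
plots_per_mean = 10


def reduced_ss(sx, sy, sp):
    """Sum of squares of x left after removing its regression on y."""
    return sx - sp ** 2 / sy


err_red = reduced_ss(ss_x["error"], ss_y["error"], sp_xy["error"])
tot_red = reduced_ss(sum(ss_x.values()), sum(ss_y.values()), sum(sp_xy.values()))
residual = tot_red - err_red               # treatment effect, 1 degree of freedom
reduced_error_ms = err_red / (error_df - 1)
F = residual / reduced_error_ms

# Corrected means (Table 87) and the t test on their difference.
b = sp_xy["error"] / ss_y["error"]
x_means = {"without": 32.1, "with": 39.1}
y_means = {"without": 51.8, "with": 40.4}
y_general = 46.1
corrected = {k: x_means[k] - b * (y_means[k] - y_general) for k in x_means}
difference = corrected["with"] - corrected["without"]
se_diff = math.sqrt(
    reduced_error_ms
    * (2 / plots_per_mean + (y_means["without"] - y_means["with"]) ** 2 / ss_y["error"])
)
t = difference / se_diff

print(f"F = {F:.2f}, t = {t:.2f}, t squared = {t * t:.2f}")
```

With only two treatment means the square of t reproduces F, which is why the second test adds nothing.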

A similar analysis for the interaction of I × M shows it to be
nonsignificant. The final conclusion would therefore be that,
when the original fertility of the plots is equalized, the application
of inorganics to the ratoon crop has produced a significant
increase in yield. The use of farmyard manure, alone or in
combination with artificial fertilizers, is of no benefit toward
increasing the yield. Of course, it is possible that the effect
of the farmyard manure may not make itself felt till the second
ratoon crop, and further yield data would be necessary to test
this point.






In this example, therefore, the use of the data from the
uniformity trial has changed a negative result into a positive one
and has made it possible to select one out of the four alternative
dressings tested as being much the best. The actual and
corrected mean yields for the four treatments are tabulated below,
and they effectively illustrate how this has happened.

                                            Mean yield,      Corrected mean yield,
Treatment                                   cwt. per plot    cwt. per plot
Control: no manure                              26.6             25.6
Farmyard manure                                 37.6             31.2
Inorganic fertilizer                            36.2             43.8
Farmyard manure + inorganic fertilizer          42.0             41.8
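The corrected means in the table above can be reproduced with the following sketch. It assumes the regression coefficient b = 0.649 and the plant-cane treatment means 47.6, 56.0, 34.4 and 46.4 (general mean 46.1) as we read them from Table 83; the dictionary layout and names are our own.

```python
B = 0.649            # regression of ratoon yield (x) on plant cane yield (y)
Y_GENERAL = 46.1     # general mean of the plant cane crop

# (ratoon mean x, plant cane mean y) per treatment, as read from the tables.
means = {
    "O": (26.6, 47.6),
    "M": (37.6, 56.0),
    "I": (36.2, 34.4),
    "MI": (42.0, 46.4),
}

# A plot group that did well as plant cane is marked down, and vice versa.
corrected = {
    name: x - B * (y - Y_GENERAL)
    for name, (x, y) in means.items()
}

for name, value in corrected.items():
    print(f"{name:>2}: actual {means[name][0]:.1f}, corrected {value:.1f}")
```

The adjustment leaves the general mean of the ratoon crop unchanged; it only redistributes yield between treatments according to the fertility measured in the uniformity trial.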


LINEAR REGRESSION COMPONENT OF THE
TREATMENT VARIANCE

In crop experiments the treatments are often quantitative in
character and represent different rates of a certain factor on
some regular incremental scale, such as zero, single, double,
and treble quantities of a certain fertilizer. When this occurs,
it is possible to segregate the linear regression component of
the treatment variance. This component is a measure of the
general effect on the crop of the increasing doses of this particular
treatment factor. If the response is sufficiently definite and
uniform, the regression and error variances will be significantly
different, as determined by evaluating F or z. When the response
is regular and the difference between the treatment means is not
very great, it is even possible for the regression variance to show
significance when the F test applied to the treatment variance
as a whole has given a negative result. The following example
effectively illustrates the application of this method to yield
data obtained from a manurial experiment with sugar cane.

The reading of z from the table for the 5 per cent distribution
is 0.6250, so that on this basis of comparison there is no significant
response to the manures applied. Examination of the treatment
totals shows a definite increase in yield from the control up to
the heaviest dressing of farmyard manure, and the effects of the
manures have evidently been swamped by plot variability. As a
Table 88. Yield of Plant Cane in Half-hundredweights per
Plot around an Assumed Mean of 40 Half-cwt.


Blocks 1 

Dressing of farmyard manure, tons 

1 per aei'e 

Block total 

” .1 

10 

20 

30 


- 4 ; 

-- 4“ 

- 4 

— 4 » 

- + 

1 

12 

5 

0 i 

7 

24 

2 . 

1. '4 ' 1 

2 ^ 

3 

1 

4 

■ a ■ 

S 

3 

3 

0 

S 

4 . 

1 

5 , 1 

6 

7 

9 

5 1 

. 6 

7 

7 

, .9. 

29 

'lYcatinciit total . .jj 

^17 

1 -{■••2 

+7 { 

4-19 

Grand total 42 


Table 89. Analysis of Variance of Yield Data

                              Degrees of                ½ log_e
Factor             S.S.       freedom      Variance     (variance/10)
Total              655.8        19
Blocks             394.3         4
Treatment           88.2         3           29.4         0.5392
Error              173.3        12           14.4         0.1823

z = 0.5392 - 0.1823 = 0.3569



more exact test of the manurial response, it is proposed to
calculate the regression of the treatment yields (x) on quantity of
manure applied (y), the dressings being taken as 0, 1, 2, and 3
units, and allowance being made for the fact that each treatment
total represents five plot yields.

S.S. linear regression = [S(xy)]² / S(y²) = 73.96

The reading of z for the 5 per cent distribution is 0.7788. 
The regression variance is therefore significant, proving that 
there has been a definite response to the farmyard manure; 
the heavier the dressing of manure, the greater the yield of cane. 
Table 90. Detailed Analysis

                              Degrees of                ½ log_e
Factor             S.S.       freedom      Variance     (variance/10)
Total              655.8        19
Blocks             394.3         4
Treatments:
  Regression        73.9         1           73.9         1.0001
  Deviations        14.3         2            7.1
Error              173.3        12           14.4         0.1823

z = 1.0001 - 0.1823 = 0.8178
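The two z comparisons of Tables 89 and 90 can be sketched as follows. The sums of squares are those printed in the tables; the function and variable names are our own, and because the variances are not rounded before the logarithms are taken, the results differ from the printed z values in the third decimal place.

```python
import math


def fisher_z(variance_a, variance_b):
    """Fisher's z: half the natural log of the ratio of two variances."""
    return 0.5 * math.log(variance_a / variance_b)


# Sums of squares and degrees of freedom from Tables 89 and 90.
treatment_ss, treatment_df = 88.2, 3
error_ss, error_df = 173.3, 12
regression_ss, regression_df = 73.96, 1

error_variance = error_ss / error_df
treatment_variance = treatment_ss / treatment_df
regression_variance = regression_ss / regression_df

# Whatever the straight line does not explain is left as deviations (2 df).
deviation_ss = treatment_ss - regression_ss
deviation_variance = deviation_ss / (treatment_df - regression_df)

z_treatment = fisher_z(treatment_variance, error_variance)
z_regression = fisher_z(regression_variance, error_variance)

# 5 per cent points of z quoted in the text.
Z_5PCT_3_AND_12_DF = 0.6250
Z_5PCT_1_AND_12_DF = 0.7788

print(f"treatments as a whole: z = {z_treatment:.4f} (needs {Z_5PCT_3_AND_12_DF})")
print(f"linear regression:     z = {z_regression:.4f} (needs {Z_5PCT_1_AND_12_DF})")
```

Concentrating most of the treatment sum of squares in a single degree of freedom is what lifts the regression variance over the significance level when the treatment variance as a whole falls short.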


CONFOUNDING OF TREATMENT EFFECTS 

At the end of an earlier chapter the need for an orthogonal design
in agricultural experiments was emphasized, and it was shown
that, when this principle was not observed, it was possible for
certain treatment effects to become entangled with one another
in a way that considerably complicates the statistical analysis
of the data. Confounding in field experiments is a term used to
define a plot arrangement in which a portion of the less important
treatment effects (usually the higher order interactions)
is purposely confounded or entangled with that of blocks. It
really represents a controlled deviation from the standard
experimental designs. Confounding is only practicable in
relatively complex factorial experiments embracing several different
problems concurrently. The technique consists of splitting up
each block into so many equal subblocks and allocating the
various treatment combinations to those subblocks in a way that
ensures that certain unimportant treatment effects are entangled
or included in the subblock variance. For any given factorial
combination the treatments may be allocated to the subblocks
in only a few alternative ways in order to confound any selected
treatment effect. Incorrect allocation to the subblocks will
result in a nonorthogonal layout, which may completely upset
the results. Yates* has enumerated the possible alternative
subblock arrangements in order to achieve the confounding of
specified treatment effects in certain standard types of
experiment. One of the simplest forms of experiment in which
confounding is practicable is the one in which there are 2 × 2 × 2 or

* The Design and Analysis of Factorial Experiments, Imp. Bur. Soil Sci.
Tech. Comm. 35.
2³ treatment series, e.g., in a trial including two varieties A1 and
A2, two spacings B1 and B2, and two dates of sowing C1 and C2.
If no confounding was introduced, the number of plots in a block
would be eight to cover the eight possible treatment
combinations, viz.:

(1) A1 B1 C1        (5) A1 B2 C1
(2) A1 B2 C2        (6) A2 B1 C1
(3) A2 B1 C2        (7) A1 B1 C2
(4) A2 B2 C1        (8) A2 B2 C2

If the experiment consisted of four such blocks of eight plots,
the skeleton analysis of variance would be as follows:

                                         Degrees of
Factor                                   freedom
Total                                      31
Blocks                                      3
Treatments:
  Main effects:  Variety (A)                1 \
                 Spacing (B)                1  |
                 Sowing date (C)            1  |
  Interactions:  1st order, A × B           1  } 7
                            A × C           1  |
                            B × C           1  |
                 2d order, A × B × C        1 /
Error                                      21


If, in the same experiment, the blocks were each subdivided into
two subblocks of four plots each to take treatments 1 to 4 and
5 to 8, respectively, the second-order interaction, A × B × C,
becomes completely confounded with the subblock sum of
squares, and the appropriate analysis would then become as
follows:

                                         Degrees of
Factor                                   freedom
Total                                      31
Blocks, i.e., subblocks                     7
Treatments:
  Main effects:  Variety (A)                1 \
                 Spacing (B)                1  |
                 Sowing date (C)            1  } 6
  Interactions:  A × B                      1  |
                 A × C                      1  |
                 B × C                      1 /
Error                                      18
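A minimal sketch of how such a split can be generated and checked is given below. It uses the usual plus and minus contrast signs for a 2 × 2 × 2 factorial, which is our own way of expressing the allocation, and the coding of levels as 0 and 1 and all names are assumptions for illustration.

```python
from itertools import product

FACTORS = ("A", "B", "C")
# Each treatment is a tuple of levels: 0 = first level, 1 = second level.
treatments = list(product((0, 1), repeat=3))


def contrast_sign(treatment, effect):
    """Sign (+1 or -1) with which a treatment enters the contrast for `effect`."""
    sign = 1
    for factor in effect:
        level = treatment[FACTORS.index(factor)]
        sign *= 1 if level == 1 else -1
    return sign


# Put treatments with the same sign in the A x B x C contrast in one subblock.
subblocks = {+1: [], -1: []}
for treatment in treatments:
    subblocks[contrast_sign(treatment, "ABC")].append(treatment)


def confounded(effect):
    """An effect is confounded if its contrast does not balance within subblocks."""
    return any(
        sum(contrast_sign(t, effect) for t in plots) != 0
        for plots in subblocks.values()
    )


effects = ["A", "B", "C", "AB", "AC", "BC", "ABC"]
status = {effect: confounded(effect) for effect in effects}

# Degrees of freedom for four replications, i.e., eight subblocks of four plots.
total_df = 4 * 8 - 1
subblock_df = 8 - 1
treatment_df = sum(1 for effect in effects if not status[effect])
error_df = total_df - subblock_df - treatment_df

print(status)
print(total_df, subblock_df, treatment_df, error_df)
```

Only the three-factor interaction fails to balance within the subblocks, so the main effects and first-order interactions remain free of subblock differences.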

By a different allocation of the treatments between the pairs
of subblocks, it is possible to confound either the A × B, the
B × C, or the A × C first-order interactions with the blocks
instead of the second-order one, A × B × C. There would then
be 3 degrees of freedom for the main treatment comparisons,
two for the unconfounded first-order interactions, and one for
the second-order interaction.

The alternative groupings are appended.

Factor confounded          Subblock a          Subblock b
A × C interaction
A × B interaction
B × C interaction


Partial Confounding. In any 2³ confounded experiment, it will
be necessary to include several replications of each treatment
series in order to provide an error variance based on an adequate
number of degrees of freedom. In the example cited, there are
four replications of each treatment combination, arranged in
four pairs of subblocks. In each pair or group of subblocks,
each treatment series will occur only once. It is valid and
sometimes advantageous to adopt the practice of partial
confounding, in which the separate subblock groups are used to
confound different treatment effects. In a 2³ confounded experiment
with four complete replications or eight subblocks in four
pairs, a partial confounded arrangement which gives a nicely
balanced design is one in which each type of interaction
(A × B, A × C, B × C, A × B × C) is confounded in a
different pair of subblocks. There will then be 1 degree of
freedom available for each main effect and each interaction,
leaving 17 degrees of freedom for error. In computing any
particular interaction for an experiment of this type, the plot
yields from the subblock pair with which this interaction is
confounded are ignored, and the data from the remaining six
subblocks are utilized to assess the interaction. It is important
to allow for this fact in the ultimate calculation of the standard
errors of the various treatment series. The replications
attributable to the treatment means from which any interaction has
been calculated will be only three-quarters of that of an
unconfounded experiment of the same type. In this example, for
the main effects, which are unconfounded, there will be 16
replications, but for the first-order interactions only 6 instead of 8, and
for the second-order interaction A × B × C three instead of
four replications.
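The degrees of freedom and the loss of replication under this balanced partial confounding can be tallied in a few lines. The assignment of a particular interaction to a particular replication below is hypothetical (the text only requires that each be confounded in a different pair of subblocks), and the names are our own.

```python
replications = 4
plots_per_replication = 8

# Hypothetical assignment: the replication (pair of subblocks) in which each
# interaction is confounded. Main effects are never confounded.
confounded_in = {"AB": 1, "AC": 2, "BC": 3, "ABC": 4}
effects = ["A", "B", "C", "AB", "AC", "BC", "ABC"]

total_df = replications * plots_per_replication - 1
subblock_df = 2 * replications - 1
treatment_df = len(effects)          # every effect keeps its single df
error_df = total_df - subblock_df - treatment_df

# Replications from which each effect can actually be estimated.
usable = {
    effect: replications - (1 if effect in confounded_in else 0)
    for effect in effects
}
relative_information = {effect: usable[effect] / replications for effect in effects}

print(total_df, subblock_df, treatment_df, error_df)
print(relative_information)
```

Each interaction is recovered from three of the four replications, which is the three-quarters figure quoted in the text.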

The 3 × 3 × 3 Confounded Experiment. The 2 × 2 × 2
experiment suffers from the obvious disadvantage that all the
treatment effects in the analysis of variance are derived from a
single degree of freedom, and in consequence, only large
treatment differences are likely to be significant. It is, furthermore,
true that, in any experiment in which a particular factor is
included at only two levels, there is a very small chance of
obtaining any accurate idea of the optimum level for this
particular factor. For this purpose at least three levels of each factor
are required, and in experiments in which great accuracy is
aimed at, an even wider range of values may be included. For
these reasons, the 3 × 3 × 3 or 3³ factorial experiment, involving
27 treatment combinations, must be regarded as decidedly
superior to the 2³ design. On a nonconfounded layout, a
3³ design makes it necessary for each block to have 27 plots.
Such large blocks are generally far from uniform, and they tend
to obviate much of the advantage of the randomized block
arrangement as a means of reducing the effects of plot
heterogeneity in the statistical examination of the results. This
disadvantage can be overcome if every block is split up into three
subblocks of nine plots each, so as to confound part of the
second-order interaction. There are four alternative ways of allocating
the treatments to the subblocks in order to obtain the required
degree of confounding. One of these selected at random is given
below. This allocation was used in a tomato manurial
experiment in which the fertilizers tested were sulphate of ammonia,
sulphate of potash, and superphosphate, each at three rates of
application, i.e., 0, 1, and 2 hundredweights per acre.

Subblock a        Subblock b        Subblock c

N0 K0 P0          N0 K0 P2          N0 K0 P1
N0 K1 P1          N0 K1 P0          N0 K1 P2
N0 K2 P2          N0 K2 P1          N0 K2 P0
N1 K0 P2          N1 K0 P1          N1 K0 P0
N1 K1 P0          N1 K1 P2          N1 K1 P1
N1 K2 P1          N1 K2 P0          N1 K2 P2
N2 K0 P1          N2 K0 P0          N2 K0 P2
N2 K1 P2          N2 K1 P1          N2 K1 P0
N2 K2 P0          N2 K2 P2          N2 K2 P1
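One way to generate an allocation of this kind is by a modular rule on the three levels. The sketch below assumes the rule (n + 2k + p) mod 3, which is our own inference from the legible entries of the allocation and is not stated in the text; the check at the end confirms that every pair of factors is fully balanced within each subblock, so that only part of the three-factor interaction is mixed up with subblocks.

```python
from collections import Counter
from itertools import product

LEVELS = (0, 1, 2)
treatments = list(product(LEVELS, repeat=3))   # (n, k, p) levels


def subblock_index(n, k, p, coefficients=(1, 2, 1)):
    """Residue class of the treatment; assumed rule, one of four possible."""
    cn, ck, cp = coefficients
    return (cn * n + ck * k + cp * p) % 3


subblocks = {0: [], 1: [], 2: []}
for n, k, p in treatments:
    subblocks[subblock_index(n, k, p)].append((n, k, p))


def pairs_balanced(plots):
    """True if each of the 9 level pairs of every two factors occurs exactly once."""
    for i, j in ((0, 1), (0, 2), (1, 2)):
        counts = Counter((t[i], t[j]) for t in plots)
        if len(counts) != 9 or set(counts.values()) != {1}:
            return False
    return True


balanced = {index: pairs_balanced(plots) for index, plots in subblocks.items()}
print({index: len(plots) for index, plots in subblocks.items()})
print(balanced)
```

Changing `coefficients` to (1, 1, 1), (1, 1, 2) or (1, 2, 2) gives the other three groupings with the same balance property.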

Table 91. Yield of Tomatoes in Baskets (10 lb.) per Plot

Grand total 432


TECHNIQUE IN AGRICULTURAL RESEARCH



There were 54 plots in all in the experiment, giving the
equivalent of two blocks (I and II) of 27 plots each to take the 27
treatment combinations. Each large block was subdivided
into three subblocks a, b, and c, and the 27 treatments allocated
to the subblocks as shown above. The arrangement of the nine
treatments within any one subblock was, of course, a random one.

Table 92. Treatment Totals

            N0     N1     N2    K total              P0     P1     P2    K total
K0          38     59     38      135       K0       45     49     41      135
K1          42     59     46      147       K1       56     42     49      147
K2          38     66     46      150       K2       54     47     49      150
N total    118    184    130      432       P total 155    138    139      432

            P0     P1     P2    N total
N0          45     39     34      118
N1          69     48     67      184
N2          41     51     38      130
P total    155    138    139      432

If the large blocks I and II had not been subdivided so as to
confound part of the second-order interaction, the allocation of
the total 53 degrees of freedom would have been as follows:

Factor                            Degrees of freedom
Total                                   53
Blocks                                   1
Main effects:  N                         2 \
               K                         2  } 6 for main effects
               P                         2 /
Interactions:
  1st order:   N × K                     4 \
               N × P                     4  } 12 for 1st-order interactions
               K × P                     4 /
  2d order:    N × P × K                 8
Error                                   26

On the confounded arrangement, one-quarter of the second-order
interaction degrees of freedom are completely entangled
with the blocks. This leaves a balance of 24 degrees of freedom
for treatments. The main treatment effects and first-order
interactions are not altered and can be calculated in the usual
way. In most experiments of this type, the unconfounded
portion of the second-order interaction (6 degrees of freedom)
can be bulked in with those for error without serious loss of
information. This is permissible as it is only in exceptional cases
that the second-order interaction will be significant. If, for any
reason, it is considered essential to evaluate this factor, it can be
calculated, but as the calculation is somewhat involved, it is not
proposed to attempt to describe the technique here. The
appended analysis of variance can therefore be considered
adequate for an accurate interpretation of results.

Table 93. Analysis of Variance

                                            Degrees of
Factor                          S.S.        freedom       Variance
Total                           422.0          53
Blocks, i.e., subblocks          41.0           5
Treatments:
  Main effects:  N              137.3           2          68.65**
                 K                7.0           2           3.50
                 P               10.1           2           5.05
  Interactions:  N × K            7.4           4           1.85
                 P × K           15.9           4           3.98
                 N × P           60.3           4          15.08*
Error                           143.0          30           4.77

* Significant at 5 per cent point.
** Significant at 1 per cent point.

The quantity of nitrogen and the interaction of nitrogen and
phosphate are the only significant effects. A difference between
the three N totals greater than √(4.77 × 18 × 2) × 2.042 or 26.75
is significant, proving that the single dressing of nitrogen has
given higher yields than either the control or the double dressing.
Applying the t test to the N × P interaction, a difference
greater than √(4.77 × 6 × 2) × 2.042 or 15.44 between any of
the nine N × P totals is significant. Reference to the N × P
treatment table shows that in this experiment the response to the
single dressing of nitrogen is better than to the double dressing
except in the presence of a light application of phosphatic manure.
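The sum of squares for nitrogen and the two significant differences quoted above can be checked with the sketch below. It assumes the N totals of Table 92 as we read them (118, 184 and 130, each from 18 plots), the error variance 4.77, and the tabulated t of 2.042; the helper name is our own.

```python
import math

# Nitrogen totals from Table 92; each is the sum of 18 plot yields.
n_totals = {"N0": 118, "N1": 184, "N2": 130}
plots_per_n_total = 18
total_plots = 54

grand_total = sum(n_totals.values())
correction = grand_total ** 2 / total_plots
ss_nitrogen = sum(t * t for t in n_totals.values()) / plots_per_n_total - correction

ERROR_VARIANCE = 4.77   # Table 93
T_5_PER_CENT = 2.042    # value of t quoted in the text


def significant_difference(plots_per_total):
    """Smallest significant difference between two totals of this many plots."""
    return math.sqrt(ERROR_VARIANCE * plots_per_total * 2) * T_5_PER_CENT


lsd_n_totals = significant_difference(18)    # main-effect totals
lsd_np_totals = significant_difference(6)    # N x P cell totals

print(f"S.S. for N = {ss_nitrogen:.1f}")
print(f"significant difference, N totals:     {lsd_n_totals:.2f}")
print(f"significant difference, N x P totals: {lsd_np_totals:.2f}")
```

The single dressing exceeds both the control and the double dressing by more than the required difference, whereas the control and the double dressing do not differ significantly from each other.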

The other three ways in which the treatments may be allocated
to the subblocks in order to confound two of the second-order
interaction degrees of freedom are shown as follows:

Subblock a        Subblock b        Subblock c

N0 K0 P0          N0 K0 P1          N0 K0 P2
N0 K1 P1          N0 K1 P2          N0 K1 P0
N0 K2 P2          N0 K2 P0          N0 K2 P1
N1 K0 P1          N1 K0 P2          N1 K0 P0
N1 K1 P2          N1 K1 P0          N1 K1 P1
N1 K2 P0          N1 K2 P1          N1 K2 P2
N2 K0 P2          N2 K0 P0          N2 K0 P1
N2 K1 P0          N2 K1 P1          N2 K1 P2
N2 K2 P1          N2 K2 P2          N2 K2 P0

N0 K0 P0          N0 K0 P1          N0 K0 P2
N0 K1 P2          N0 K1 P0          N0 K1 P1
N0 K2 P1          N0 K2 P2          N0 K2 P0
N1 K0 P1          N1 K0 P2          N1 K0 P0
N1 K1 P0          N1 K1 P1          N1 K1 P2
N1 K2 P2          N1 K2 P0          N1 K2 P1
N2 K0 P2          N2 K0 P0          N2 K0 P1
N2 K1 P1          N2 K1 P2          N2 K1 P0
N2 K2 P0          N2 K2 P1          N2 K2 P2

N0 K0 P0          N0 K0 P1          N0 K0 P2
N0 K1 P2          N0 K1 P0          N0 K1 P1
N0 K2 P1          N0 K2 P2          N0 K2 P0
N1 K0 P2          N1 K0 P0          N1 K0 P1
N1 K1 P1          N1 K1 P2          N1 K1 P0
N1 K2 P0          N1 K2 P1          N1 K2 P2
N2 K0 P1          N2 K0 P2          N2 K0 P0
N2 K1 P0          N2 K1 P1          N2 K1 P2
N2 K2 P2          N2 K2 P0          N2 K2 P1

In the tomato experiment there were two complete replications
of the 27 treatment combinations, and the same allocation
of the treatments to the three subblocks was used for each
replication. Actually, when there is more than one complete
replication, it is generally considered better to select, from the
four optional arrangements, a different allocation of the
treatments to the subblocks for each available replication. This is
really partial confounding, as a different portion of the
second-order interaction will be confounded in each complete block or
replication. Provided the second-order interaction variance is
bulked in with the error, the analysis of variance of the data
will not be altered by this partial confounding technique. It will
be noted that it is only the second-order interaction that has
been confounded, and in agricultural experiments in general, it
is usually advisable to leave the main effects and the first-order
interactions unconfounded and limit the confounding to the
higher order interaction effects.

SUBDIVISION OF THE TREATMENT RESPONSES IN
A 3³ EXPERIMENT

When any factor is included in an experiment at three different
levels, it is possible to split up the treatment variance into two
components representing (a) the linear response due to the
difference between the extreme levels and (b) the curvature, or
deviation of the intermediate level from this linear response.
Each component accounts for one of the available degrees of
freedom. Yates gives a simple method of evaluating these
components for the main effects.

If a0, a1, a2 represent the totals of the individual plot yields
for the factor A included at the three levels 0, 1, 2, respectively,

\[
\text{Linear response} = \frac{(a_2 - a_0)^2}{2n}
\]

\[
\text{Curvature} = \frac{(a_2 - 2a_1 + a_0)^2}{6n}
\]

where n represents the number of plots in any of the treatment
totals evaluated in the above expression. The linear response
formula is, of course, merely another version of that already
given in Chap. II for evaluating any treatment sum of squares
dependent on only two totals.

Applying this technique to the main treatment effects of the
tomato manurial experiment (Table 92), for the main effect K,

\[
\text{Linear response} = \frac{(150 - 135)^2}{2 \times 18} = 6.25 \text{ with 1 degree of freedom}
\]

\[
\text{Curvature} = \frac{(150 - 2 \times 147 + 135)^2}{6 \times 18} = 0.75 \text{ with 1 degree of freedom}
\]
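The two components can be computed with a pair of small helper functions, sketched here with names of our own choosing. The potash and nitrogen totals are those of Table 92, each being the sum of 18 plot yields.

```python
def linear_ss(a0, a2, n):
    """Sum of squares (1 df) for the linear response between the extreme levels."""
    return (a2 - a0) ** 2 / (2 * n)


def curvature_ss(a0, a1, a2, n):
    """Sum of squares (1 df) for the departure of the middle level from the line."""
    return (a2 - 2 * a1 + a0) ** 2 / (6 * n)


PLOTS_PER_TOTAL = 18

# Totals at levels 0, 1 and 2, as read from Table 92.
potash = (135, 147, 150)
nitrogen = (118, 184, 130)

for name, (a0, a1, a2) in (("K", potash), ("N", nitrogen)):
    linear = linear_ss(a0, a2, PLOTS_PER_TOTAL)
    curvature = curvature_ss(a0, a1, a2, PLOTS_PER_TOTAL)
    print(f"{name}: linear {linear:.2f}, curvature {curvature:.2f}, "
          f"sum {linear + curvature:.2f}")
```

In each case the two components add up to the main-effect sum of squares of the original analysis, which provides a convenient arithmetical check.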
The aggregate of the two components adds up to the total
sum of squares for the main effect of potash, as given in the
original analysis of variance, Table 93. The linear component
accounts for the major portion of the response to potash. With
an error variance of 4.77, this linear variance is still nonsignificant.
It is not impossible, however, for the linear response to be
significant when the main effect as a whole is nonsignificant. Where
there is any apparent response to increasing rates of any factor,
this subdivision of the treatment variance should certainly be
carried out. With the nitrogen and phosphate factors (Table 92),
the higher levels have given lower yields than the lower levels,
and the analysis to the two components will not be of any value.
The formulas still apply, however. For example,

\[
\text{N, linear response} = \frac{(130 - 118)^2}{2 \times 18} = 4.0
\]

\[
\text{Curvature} = \frac{(130 - 2 \times 184 + 118)^2}{6 \times 18} = 133.3
\]

Main effect, N = 4.0 + 133.3 = 137.3 (as originally calculated)

It is also possible to split up the first-order interaction effects
between the linear response and curvature factors. This process
is not of such general utility, and as its exact significance is
rather difficult to explain in simple terms, it is not proposed to
elaborate it here.

A 3³ EXPERIMENT WITHOUT REPLICATION

The 3³ factorial experiment is a particularly useful design, as
it allows three factors to be tested in all combinations at a
sufficient number of different levels to show up any definite
treatment responses. It is especially adapted to fertilizer
experiments, as the three main types of fertilizer (N, P, and K) can
all be included and their interactions and optimum rates
determined. A very useful form of the 3³ experiment is the one
limited to a total of 27 plots with the second-order interaction
confounded in the subblocks. As there are 27 different treatment
combinations, this means that there will be no replication of any
one treatment type. The estimate of the error variance is
derived from the remaining 6 degrees of freedom of the
second-order interaction after eliminating the two confounded degrees
of freedom. An experiment of this kind cannot give the
same degree of precision as would be obtained when two or more
replications of each treatment series are included. It does make
it possible to lay down a fairly comprehensive type of experiment
on a small acreage and at relatively low cost. Several such
nonreplicated experiments at different centers would probably
be much more informative than a single large-scale experiment
costing about the same amount but located on only one soil
type.

COMPLEX CONFOUNDED DESIGNS

The principle of confounding can be applied to many other
factorial designs, including the 3 × 2 × 2 and similar treatment
combinations. Many of these forms have been elaborated by
Yates.* With high degrees of confounding, the statistical
analysis tends to become somewhat involved, especially when the
factors are included at varying levels as in a 3 × 2 × 2
experiment. The principle has also been adapted to the Latin square,
though its application in this direction is of necessity very much
more limited. A very useful form of this in a 3³ factorial
experiment is a 9 × 9 Latin square in which the second-order interaction
is confounded with both the rows and the columns of the square.

Table 94. A 9 × 9 Quasi-Latin Square for a 3³ Factorial Experiment:
Interaction Confounded in the Rows and Columns

* The Design and Analysis of Factorial Experiments, Imp. Bur. Soil Sci.
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A square of this type is termed a quasi-Latin square. Any of the 
alternative arrangements already enumerated for the confound- 
ing of the A X B X C interaction in a 3^ design may be used to 
construct such a square. Yates gives the following design. The 
three factors concerned are A, and C, each at the three levels 
Oj 1^ and 2. It will be assumed that the numbers tabulated in 
each plot represent, from left to right, the levels of the three 
factors in the order A, 5, C, as entered in Ml in the first row^ 
Thus, the treatment in the bottom right-hand corner plot is 

Provided a new randomization of the complete rows and the 
complete columns is effected on each occasion, this design may be 
used as a standard form for fixing the layout of a field experiment 
of this type. If one assumes that the second-order interaction 
effects may safely be bulked in with the error sum of squares, 
the analysis of variance is perfectly simple, being of the type: 

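                                          Degrees of 
                                           freedom 
Total ...................................     80 
Rows ....................................      8 
Columns .................................      8 
Treatments: Main effects ................      6 
            1st-order interactions ......     12 
Error ...................................     46 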
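
The partition of the degrees of freedom shown above can be checked by simple arithmetic. The following Python sketch is offered only as an illustration; the function name `quasi_latin_square_df` and its arguments are hypothetical and do not come from the text. It assumes a 9 X 9 square carrying three factors at three levels each, with the second-order interaction bulked in with error.

```python
def quasi_latin_square_df(side=9, factors=3, levels=3):
    """Partition the degrees of freedom for the quasi-Latin square analysis.

    Assumes, as in the text, that the second-order interaction is
    bulked in with the error sum of squares. Names are illustrative.
    """
    total = side * side - 1                          # 81 plots give 80
    rows = side - 1
    columns = side - 1
    main_effects = factors * (levels - 1)            # 3 factors, 2 each
    factor_pairs = factors * (factors - 1) // 2      # A X B, A X C, B X C
    first_order = factor_pairs * (levels - 1) ** 2   # 3 pairs, 4 each
    error = total - rows - columns - main_effects - first_order
    return {
        "Total": total,
        "Rows": rows,
        "Columns": columns,
        "Main effects": main_effects,
        "1st-order interactions": first_order,
        "Error": error,
    }


for source, df in quasi_latin_square_df().items():
    print(f"{source:<24}{df:>4}")
```

The error line is obtained by subtraction, which is also how it would normally be found when the analysis is worked by hand.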


Confounding in Split-plot Design. — Another useful form, 
involving confounding in a Latin-square layout, is the one 
applicable to the split-plot experiment in which there are six 
whole-plot treatments, A, B, C, D, E, and F in a 6 X 6 Latin 
square, combined with subplot treatments of the 2 X 2 X 2 
factorial type. There will then be eight subplot treatment 
combinations, which will be most easily comprehended by using 
symbols appropriate to a 2³ fertilizer experiment with each of 
the three main fertilizers at two rates. On this assumption, the 
subplot treatments would be of the type 

(I) n, p, k, npk 

(II) 0, np, nk, pk 

By splitting these eight combinations into two groups of four 
as shown above in the lines I and II, the second-order interaction 
N X P X K is confounded with the groups. This fact can be 
utilized in the split-plot experiment by subdividing each whole 
plot to four subplots and allocating one or other group of subplot 
treatments to each whole plot in accordance with the 
appended design. 
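
The division of the eight combinations into groups I and II can be reproduced mechanically. The sketch below is an illustration under stated assumptions, not the author's own procedure: it takes the sign each combination carries in the N X P X K contrast, written as (n - 1)(p - 1)(k - 1), and places positive combinations in one group and negative ones in the other. The function names are hypothetical.

```python
from itertools import product

FACTORS = "npk"


def label(levels):
    """Return the usual symbol for a combination, e.g. (1, 0, 1) -> 'nk'."""
    name = "".join(f for f, level in zip(FACTORS, levels) if level)
    return name or "0"


def npk_sign(levels):
    """Sign of a combination in the N X P X K contrast (n - 1)(p - 1)(k - 1).

    Combinations with an odd number of letters are positive,
    those with an even number are negative.
    """
    return 1 if sum(levels) % 2 == 1 else -1


def split_into_groups():
    """Split the eight 2 X 2 X 2 combinations by the sign of N X P X K."""
    groups = {1: [], -1: []}
    for levels in product((0, 1), repeat=3):
        groups[npk_sign(levels)].append(label(levels))
    return groups[1], groups[-1]


def main_effect_balance(group):
    """Count how many combinations in a group include each fertilizer."""
    return {f: sum(f in name for name in group) for f in FACTORS}


group_i, group_ii = split_into_groups()
print("I: ", sorted(group_i))
print("II:", sorted(group_ii))
print(main_effect_balance(group_i), main_effect_balance(group_ii))
```

Each fertilizer occurs at the higher rate in two of the four combinations of either group, so the main effects remain balanced within a whole plot and only the N X P X K comparison is sacrificed.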




Each square of the diagram represents a whole plot, and 
the letters specify the whole-plot treatment. The prefixes 
I or II attached to the letters indicate the particular group 
of subplot treatments which should be randomized among the 
four subplot units to which this whole plot is split up. With 
this design the second-order subplot treatment interaction 
N X P X K is confounded with the columns, and the third-order 
interaction between the whole-plot and subplot treatments 
is confounded with the rows. The statistical analysis, ignoring 
the two confounded treatment effects, will be of the type 
tabulated on page 240. 

The general principle of confounding certain subplot treatment 
effects in a split-plot experiment is worth noting, as it makes it 
possible to adopt a relatively small whole-plot unit, reduces the 
size of the whole-plot blocks, and increases the number of 
whole-plot treatment replications. It therefore overcomes some of the 
disadvantages associated with the split-plot design. 

The practice of confounding, applied to field experiments, has 
been discussed at some length, as it appears to be the direction 
from which the greatest immediate improvement in experimental 
design may be expected. The advantage of confounding lies 
in the practicability of planning an efficient experiment involving 
several treatment series in all combinations on a relatively small 
area. The introduction of the subblock arrangement associated 
with the confounded experiment ensures better control of the 
soil heterogeneity factor than would otherwise be possible with 
the large blocks necessary to a nonconfounded experiment of 
the same general type. On the other hand, confounding is 
only practicable in complex experiments of a rather specialized 
character, and it is a system that can be wholeheartedly recommended 
only when the man in charge is sure of his technique both 
in the field and also in the statistical office, where the final 
evaluation of the data will be effected. 

                                                         Degrees of 
                                                          freedom 
Whole plots ............................................     35 
  Rows .................................................      5 
  Columns ..............................................      5 
  Whole-plot treatments (W) ............................      5 
  Error (a) ............................................     20 
Total subplots .........................................    143 
  Subplot treatments: 
    Main effects .......................................      3 
    First-order interactions N X P, N X K, P X K .......      3 
  Interactions: 
    Whole-plot treatment (W) X subplot treatment 
      W X main effects .................................     15 
      W X first-order interactions .....................     15 
  Error (b) ............................................     72 
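
The degrees of freedom in the split-plot analysis tabulated above can be verified in the same way. This Python sketch is only an illustrative check, and the function name `split_plot_df` is hypothetical. It assumes a 6 X 6 Latin square of whole plots, each split into four subplots, with the N X P X K interaction and its interaction with the whole-plot treatments left out of the analysis because they are confounded.

```python
def split_plot_df(side=6, subplots_per_whole_plot=4):
    """Partition the degrees of freedom for the confounded split-plot design.

    Assumes three subplot main effects (N, P, K) and three first-order
    interactions; the confounded N X P X K terms are ignored, as in the text.
    """
    whole_plots = side * side                         # 36 whole plots
    whole_total = whole_plots - 1                     # 35
    rows = columns = treatments = side - 1            # 5 each
    error_a = whole_total - rows - columns - treatments

    subplot_total = whole_plots * subplots_per_whole_plot - 1   # 143
    main_effects = 3                                  # N, P, K
    first_order = 3                                   # N X P, N X K, P X K
    w_by_main = treatments * main_effects             # 5 x 3
    w_by_first_order = treatments * first_order       # 5 x 3
    error_b = (subplot_total - whole_total - main_effects - first_order
               - w_by_main - w_by_first_order)
    return {
        "Whole plots": whole_total,
        "Rows": rows,
        "Columns": columns,
        "Whole-plot treatments": treatments,
        "Error (a)": error_a,
        "Total subplots": subplot_total,
        "Main effects": main_effects,
        "First-order interactions": first_order,
        "W X main effects": w_by_main,
        "W X first-order interactions": w_by_first_order,
        "Error (b)": error_b,
    }


for source, df in split_plot_df().items():
    print(f"{source:<32}{df:>4}")
```

Both error lines are found by subtraction: error (a) from the whole-plot total, and error (b) from what remains of the subplot total after the whole plots and the unconfounded treatment terms have been removed.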


VALEDICTORY REMARKS 

It is considered that further discussion of more complex 
experimental designs would be definitely out of place in an 
elementary textbook on applied statistics. It is hoped that 
sufficient examples have been given to demonstrate that for 
established forms of experiment the statistical calculations 
may be reduced to a simple routine which leaves no excuse for any 
ambiguity in the interpretation of the results. 

In 1849, in the preface to his “Experimental Agriculture,” 
J. F. W. Johnston wrote: 

It is only by means of conjoined experiments in the field, the feeding 
house and the laboratory — all made with equal care, conscientiousness and 
. . . advanced agriculture can hereafter be with certainty 
. . . ought to advance the more heartily now we have found it. 


The following quotation from the title page of the earlier 
numbers of the Journal of the Royal Agricultural Society of 
England is also very apt: 

These experiments, it is true, are not easy; still they are in the power 
of every thinking husbandman. He who accomplishes but one, of 
however limited application, and takes care to report it faithfully, 
advances the subject and, consequently, the practice of agriculture, and 
acquires thereby a right to the gratitude of his fellows, and of those who 
come after. . . . The first care of all societies formed for the improvement 
of our science should be to prepare the forms of such experiments, 
and to distribute the execution of these among their members. 


This last sentence effectively summarizes the author’s aim 
in preparing this elementary exposition of statistical methods, 
which it is hoped may ultimately prove of some value in the 
distribution and correct application of certain of the existing 
forms of experiment. 
For larger values of n, the expression $\sqrt{2\chi^2} - \sqrt{2n - 1}$ may be used as a normal deviate with unit variance. 

* Reproduced by kind permission of Professor R. A. Fisher and of his publishers, Messrs. Oliver & Boyd, Edinburgh. 
* Reproduced from “Logarithmic and Other Tables,” by Frank Castle, by kind per- 
Table VI. 


Values of the number of degrees 


I 0.05 0.01 0.05 0.01 0.05 0.01 0.05 0.01 0.o| 0.01 0.05 0.01 0.05 0.01 ' 


4,052 200 4,990 210 5,403 225 5,625 230 5,704 234 6,S59 237 5,92S 

98.40 10.0(1 99.01 19.16 09.17 19.25 90.25 19.30 99.30 19.33 90,33 19.36 99,34 

34.12 9.55 30.S1 9.2S 20.46 9.12 2S,71 9.01 28.24 8.94 27.91 8.88 27,07 

' 21.20 6.94 18.00 0.59 16.69 6,39 15.98 0.25 15.52 6.16 15.2 1 6.09 14.98 

16.26 5.79 13.27 5.41 12.06 5.19 11.39 5.05 10.97 4.95 10.67 4.88 10.45 

13.74 5.14 10.92 4,7il 9.78 4.,53„^JKj[^ 4.?fe 8.75 4.28 8.47 4.21 8.20 

12.25 4.74 9.55 4.35 8.45 1.12 7.85 3.97 7.46 3.S7 7.19 3.79 7,00 

n,26 4.46 8.66 4.07 7.59 3,84 7,01 3.69 0 03 3.5S 6.37 3,50 6.19 

10.5q 4,26 8.02 3.86. G.99 3.63 6.42 3,48 6.06 3.37 6.80 3,29 5.62 

lO.Ol 4.10 7.56 3.71 6.55 3.4S 5.99 3.33 5.64 3.22 5.39 3.14 5.21 

9.65 3.9S 7.'20 3.59 6.22 3.30 5.07 3.20 5.32 3.09 5.07 3.01 4,S8 

9.33 3.8S 6.93 3.49 5.95 3.26 5.41 3.11 5.06 3,00 4.B2 2.92 4.65 

9.07 3.80 6.70 3.41 5.74 3.18 5.20 3.02 4.S6 2.92 4.62 2.84 4.44 

8.86 3.74 6,51 3.34 5.56 3.11 6.03 2,96 4.09 2.S5 4,46 2,77 4.28 

-,468^ 3,68 6,36 3 29- 5.4^ 3.06 4.89 2.90 4.56 2.79 4.32 2.70 4.14' 

SlS'T.er 3.24 5.29 2.85 4.44 2.74 4.20 2.66 4.03' 

8.40 3.59 6,11 3.20 5.18 ioo" 4.67 2.81 4.34 2.70 4.10 2.62 3.03, 

8.28 3.56 6,01 3.16 5.09 2.93 4.58 2t77 4,25 2,66 4.01 2.5S 3.85 

8.18 3.52 6,93 3.13 6.01 2.90 4.60 2.74 4.17 2.63 3.94 2.56 3.77 

8.10 3.49 5.85 3.10 4.94 2.87 4.43 2.71 4.10 2.60 3.S7 2.52 3.71 

8.02 3.47 5.78 3.07 4.87 2.S4 4.37 2.68 4.04 2.57 3.81 2.49 3.65 

7.94 3.44 5.72 jy)5^ fS2 2.82 4.31 2.66 3.99 2.55 3.76 2^71 3,69 

7.88 3.42 5.66 3.0r“4.76 2.80 4.36 2.64 .3.94 2.53 3.71 2.45‘^£54 

7.82 3.40 5.61 3.01 4.72 2.78 4.22 2.62 3.90 2.51 3.67 2.43 3.50 

7.77 3.38 5.67 2.99 4.68 2.76 4.18 2.60 , 3.80 2.49 3.03 2.41 3.46 

7.72 3.37 5.63 2.98 4.64 2.74 4,14 2.59 TsF 2,47 3.69 2.39 3.42 

7.65 3.35 5.49 2.96 4,60 2.73 4.11 2.57 3.79 2.46 3.56 2.37 3.39 

7.64 3.34 5.45 2.95 4..57 2.71 4.07 2.56 3.76 2.44 3.53 2.36 3.36 

7.60 3,33 6.42 2,93 4.54 2.70 4.04 2.64 3.73 2.43 3.50 2,35 3.33 

7.50 3.32 5.39 4.51 2.69 4.02 2.53 3.70 2.42 3.47 2.34 3.30 

7.50 3.30 5.34 iSo 446 2.67 3.97 2.51 3,66 2.40 3.42 2,32 3.25 

7.44 3.28 5.29 2.88 4.42 2.65 3.93 2.49 3.01 2.38 3.38 2,30 3.21 

7,35 3.25 5.21 2.85 4.34 2.02 4.86 2.40 3,54 2.3.5 3.32 2.26 3.15 

7.27 3,22 5,16 2.88 4.29 2.59 (B.80 2.44 3.49 2.32 3.26 2,24 3,10 

ZM 5.10 2.81 4,34 2.57 3.76 2,42 3.44 2.30 3.22 3.22 3.05 

7.17 3.18 5.06 2.79 4.20 2.56 3 72 2^’ 3.^1 ^2,29 3.IS 2.20 3.02 

7.08 3.15 4.98 2.76 4,13 2.62 3.65 'ST' :l34 2.25 3.12 2.17 2.95 

6.96 3.11 4,88 2,72 4,04 2.48 3.56 2,33 3.25 2.21 3.04 • 2.12 2.b7 

6.90 3,09 4.82 2.70 3.98 2.46 3.51 2.30 3.20 2.19 2.99 2.10 2.82 

C.M 3.04 4.71 2.65 3.S8 2.41 3.41 2.26 3.11 2.14 2.90 2.05 2.73 

6.1 3,00 4.62 2.61 3.80 2.38 3.34 2.22 3.04 2.10 2.82 2.02 2.66 

6.64 2.99 4.60 2.00 3.78 2.37 3.32 2,21 3.02 2.09 2.80 2.01 2.64 


* Reproduced from “Statistical Methods,” by kind permission of the author, Professor G. W. Snedecor 
Table VII. — Table of the Minimum Number of Replicates Necessary 
to Demonstrate Significant Treatment Differences at the 
5 Per Cent Point 
Correlation coefficients, calculation 
of standard errors in, 113, 118, 
121 

χ² test for homogeneity in, 117 
comparison of, 113 
estimation of significant differ- 
ences in, 114, 121 
Correlation diagrams, 103 
Correlation table, 110 
calculation of correlation coeffi- 
cient from, 110-112 
Covariance, 106, 148-151, 214 
adjustment of treatment means in, 
151 

degrees of freedom in, 149 
reduction of error variance by, 150 
tests of significance in, 150-151 
Covariance, analysis in field experiments, 214-227 

adjusted treatment means in, 
215, 219, 223 

standard error of, 219, 223 
coefficient of regression in, 218 
Critical difference, 41 

D 

Decimals, elimination of, 27 
Degrees of freedom, number of, 4 
Deviation from mean, 4-5, 8 

(See also Standard deviation) 
Diagrams, 89ff. 
dot, 103-104 
quadrants in, 104 
essentials of good, 90 
in form of solid models, 93 
(See also Graphs) 


Discrete variates, 2 
as in Poisson distribution, 88 
Distribution, binomial, 72 
frequency, 8, 73, 98 
normal, 6, 9, 16 
Poisson, 88 

E 

Error, 12 

(See also Standard error) 

Error variance, 33 
in nonreplicated factorial experi- 
ment, 236 

reduced by analysis of covariance, 
149 

subdivision of, for data in sub- 
units, 53-56 
(See also Covariance) 

Errors of random sampling, 15, 31 


F 

F, ratio of variances, 38 
compared with z and t, 44 
table of, 254-255 
F test, 38-39 

Factorial experiment, 47, 179, 205 
without replication, 236 
with 3³ treatments, 236 
linear response in, 235 
Field experiments, 
accuracy of field technique in, 
157-160 

border effects in, 161 
(See Borders) 
complex, 205 

(See also Complex experi- 
ments) 

designs in, 163, 180 
orthogonality in, 178-180 
(See also Randomized block; 
Latin square; Split plot; 
Confounded) 

grouping of treatment compari- 
sons in, 176 

important factors in designing, 
156 
treatments, choice of, for, 160 
Fodder grasses, tropical, 36-37 
nitrogen content of, 36 
Formulas, 28 
for analysis of variance, 67-69, 167, 172 
for standard deviation, 28 
for standard error, 29 
of a difference, 29 
Frequency diagram, 7, 98 
Frequency table, 77 
calculation of standard deviation from, 101 
by assumed-mean method, 102 
by the variable-squared method, 162 
Genetics, problems in, 81ff. 
calculation of, 82, 85-87 
algebraic expressions used in, 86 
Goodness of fit, 70 
use of χ² to determine, 70ff. 
Graphs, 90-91 
columnar, 91 
plotting of curve in, 
selection of scale for, 90 
showing negative correlation, 92 
Grouped data, analysis of, 201 
with different assumed means, 
Grouping of variates in frequency tabulation, 
Growth measurements, 93 
Histogram, 7, 92 
Hyperbolic logarithms, 42 
Latin-square experiment, analysis of 
variance of, 169-171 
formulas for, 172 
quasi, 237 
replicated, 172 

analysis of variance of, 172-175 
standard forms of, 168 
Linear regression component of 
variance, 144, 147 
calculation of, 145-146, 153 
of treatment, 225-227 
deviations from, 225 
in 3³ factorial design, 235 
Logarithms, Napierian or hyper- 
bolic, 42 

table of, 252-253 


Main effects in treatment variance, 46 
formulas for calculating, 68 
Mean, 3, 97 
accuracy of, 12 
calculation of, 4-6 
estimates of, 13 
adjusted by regression, 148 
standard error of, 42 
general, 41 
Means, comparison of, 13 
(See also Significance) 
Mendelian theory, 81 
Missing plot, estimating yield of, 181 
in Latin square, 184 
formula for, 184 
in randomized block design, 181 
formula for, 182 
(See also Incomplete records) 
Mode, 97 
Napierian logarithms, 42, 252 
Normal curve, 6, 8-9, 16, 97 


tail of, 10 


Observations, qualitative and quantitative, 1 
Odds against, 9, 15 
Orthogonality, principle of, 178-180, 227 

as applied to confounded ex- 
periments, 227-235 


Partial confounding in field experiments, 230 

Partial correlation coefficient, 119 
computation of, 120 
formula for the, 119, 122 
degrees of freedom of, 120, 123 
determination of significance of, 
120, 123 

in terms of Z, 121 
Partial regression, 155 
Perennial-crop experiments, 191-201 
border rows in, 193-195 
important factors in designing, 
192-195 

plant uniformity, 192 
records, 195 
size of plot, 193 

statistical analysis of data from, 
196-200 

Phenotypes in a poultry cross, 82 
Plots in field experiments, 161 
arrangement of, 162 
correlation in fertility of adjacent, 
162, 164 
number of, 165 
randomization of, 165 
size and shape of, 161 
split, 209 

(See also Randomized block; 
Latin square) 

Poisson distribution, 88 
Population, statistical, 1 
variation in, 2 

Poultry, genetical problems with, 
82ff. 

Precision, experimental, 63 



Probability, 9 
5 per cent point, 15 
integral, 10 
Probable error, 23 


Quartiles, 23 
Quasi-Latin square, 237 
analysis of variance of, 238 


Randomized block experiment, 164 
analysis of variance of, 160 
formulas for, 166 
error variance in, 106, 179 
Range of variates, 4, 11 
Regression, 130ff. 
coefficient of, 131 
calculation of, 131-134, 139 
degrees of freedom of, 139 
estimation of significance of, 
138-141, 146 

independent estimates of, 142 
significant difference between, 144 

standard error of, 139-142 
as tangent of line of linear 
regression, 137 
equation of, 131, 135 
for curved regression lines, 155 
graph of, 130, 131, 137 
line of best fit in, 134 
line of linear, 130-131, 135 
deviation from, 144-147, 153 

measurement of trend of variates, 

partial regression, 155 
reduction of error variance by, 146 
Replicates, number of, for different 
Sampling, correct, 16 
Seasonal effects in field experiments, 156, 176 
Serial experiments, 189 
statistical analysis of, 189-191 
Significance, between a number of 
in small samples, 18 
Split-plot experiment, 209 
Standard deviation, 3 
Standard error, of correlation coefficient, 108 

of difference between means, 13- 
14 

of difference between totals, 27 
short methods of calculating, 27 
“Student's” method of statistical 
analysis, 19-22 

Subunits, analysis of data divided into, 
52-58, 196-200 
Sum of products, 106 
short methods of calculating, 
110-112 

correction factor in, 110 
Sum of squares, 6 
evaluated in class intervals, 101 
short methods of computing, 
24-26 

for variates in arithmetical 
progression, 141 

T 

t, 17 

compared with F and z, 44 
number of degrees of freedom of, 
18 

table of, 248 

Tables, statistical, 247-256 
250 

F, 254-255 

Napierian logarithms, 252-253 
number of replicates for significance, 256 
t, 248 
x, 247 

z, 5 per cent points, 249 
Treatment variance, 34 
analysis of, 45ff. 
grouping of treatments in, 176 
with varying treatment repli- 
cations, 60 

interaction component of, 46 
calculation of, 48 
linear regression component of, 

in 3³ factorial design, 235-236 


Treatment variance, main effects in, 
46 

in 3³ factorial experiment, 235-236 
Treatments, factorial arrangement 
of, 47 

statistical comparison of several, 
40-42 

Uniformity trials, 168, 215 
to control plot variation by covari- 
ance technique, 221-225 
(See also Covariance) 

V 

Valedictory remarks, 240 
Variable, 1ff. 

Variable-squared method of calcu- 
lating standard deviation, 26 
Variance, 6 
analysis of, 31ff. 

component factors of, 31 
for data divided into subunits, 
53-56, 196-200 

by variable-squared method, 
35 

linear regression component of, 
144-146, 155 
between series, 31 
calculation of, 32-33 
within series, calculation of, 31 
treatment of, 34 
linear regression component of, 
225-227, 235-236 

Variances, comparison of by z or F 
tests, 43 
Variate, 1ff. 

continuous or discrete, 2 
subdivision of, 52 
Variation, 2 
coefficient of, 22-23 

W 

Whole plots, 209 

(See also Split plot) 
Whole units, subdivision of, 52-58, 


(See also Subunits) 


compared with F and t, 44 
simple method of calculating, 
table of 5 per cent distribution of, 




